Implementation of Distributed Load Balancing Algorithms

In distributed projects, in order to improve the availability of the system, service providers usually do cluster processing. When one of the services goes down, other services in the cluster can still provide services, so as to improve the reliability of the system.

The commonly used load balancing algorithms are:

  • Stochastic algorithm
  • Weighted Random Algorithms
  • polling algorithm
  • Weighted Round Robin
  • Minimum Delay Algorithms
  • Consistency hash algorithm

Load balancing pursues the same load for each service provider and does not lead to load imbalance.

See github for all the following code: Algorithm implementation Test code

Get ready

This is a POJO of a service provider that contains information about the host s and port s of the service.

@Slf4j
@Data
@AllArgsConstructor
@NoArgsConstructor
public class ProviderConfig implements Serializable{

    private static final long serialVersionUID = 1;
    //Signal communication host
    private String host;
    //Communication port
    private Integer port;

    //Request interface name
    private String interfaceName;
    //Request method
    private String[] methods;
    //apply name
    private String application;
    //weight
    private int weight;
    //Calling time
    private int callTime;

}

 

 

Define load balancing policy interface

public interface LoadbalanceStrategy {
  //object is an extended parameter
    public ProviderConfig select(List<ProviderConfig> configs, Object object);
}

 

Stochastic algorithm

Random algorithm, that is to say, random selection from the list of services. If the random number generation algorithm is not good, it will lead to bias, resulting in high hit probability for some services, low hit probability for some services, and even zero hit rate for some services. In the end, the time delay with high hit rate will be very serious.

The advantage of stochastic algorithm is its simple implementation.

 

Realization

That is to generate a random number and select a very simple algorithm from the service List.

public class RandomLoadbalanceStrategy  implements LoadbalanceStrategy{

    @Override
    public ProviderConfig select(List<ProviderConfig> configs, Object object) {
        int index = new Random().nextInt(configs.size());
        return configs.get(index);
    }
}

 

test

Note: This method will be used for testing later on.

//Load Balancing strategy: Random, Weighted Random, Polling, Weighted Polling
//CongNum Number of Producers
//TesCount
public void loadbalace(LoadbalanceStrategy strategy ,int configNum,int testCount ){

List<ProviderConfig> configs = new ArrayList<>();
int[] counts = new int[configNum];


for(int i = 0; i< configNum; i++){
ProviderConfig config = new ProviderConfig();
config.setInterfaceName("com.serviceImpl");
config.setHost("127.0.0.1");
config.setPort(i);
config.setWeight(new Random().nextInt(100));
configs.add(config);
}

//System.out.println(configs);

for(int i = 0; i< testCount ; i++){
ProviderConfig config = strategy.select(configs,null);
// System.out.println("Selected:"+config);
Integer count = counts[config.getPort()];
counts[config.getPort()] = ++count;

}

for(int i = 0; i< configNum; i++){
System.out.println("Serial number:" + i + " Weight:" + configs.get(i).getWeight() + "--Frequency:" + counts[i]);
}

}

 

Execution testing

LoadbalanceStrategy strategy1 = new RandomLoadbalanceStrategy();
loadbalace(strategy1,10,1000);

 

output

Random Load Balancing...
Number: 0 - Number: 98
Number: 1 - Number: 97
Number: 2 - Number: 86
Number: 3 - Number: 99
Number: 4 - Number: 116
Number: 5 - Number: 98
Number: 6 - Number: 96
Number: 7 - Number: 102
Number: 8 - Number: 101
Number: 9 - Number: 107

 

From the test results, Jdk's random algorithm is relatively uniform.

Weighted Random Algorithms

Weighted randomization is to add a weight to each service on the basis of random algorithm. The larger the weight, the greater the probability.

When applications are deployed in a distributed way, the differences in hardware performance and environment will lead to inconsistencies in service performance.

In order to solve this problem, we can reduce the weight of service with poor performance and increase the weight of service with good performance so as to achieve the effect of load balancing as far as possible.

 

Weighted Random Algorithms

Realization


public
class WeightRandomLoadbalanceStrategy implements LoadbalanceStrategy{ @Override public ProviderConfig select(List<ProviderConfig> configs, Object object) { List<ProviderConfig> newConfigs = new ArrayList<>(); for(ProviderConfig config:configs){ for(int i = 0; i< config.getWeight(); i++){ newConfigs.add(config); } } int index = new Random().nextInt(newConfigs.size()-1); return newConfigs.get(index); } }

 

Or use the test code load balace (Loadbalance Strategy, int configNum, int testCount) to test.

System.out.println("Weighted Stochastic Load Balancing....");
LoadbalanceStrategy strategy2 = new WeightRandomLoadbalanceStrategy();
loadbalace(strategy1,10,1000);

 

Test results:

Weighted stochastic load balancing...
Number: 0 Weight: 44 - Number: 101
Number: 1 Weight: 27 - Number: 63
Number: 2 Weights: 22 - Numbers: 47
Number: 3 weights: 61 - times: 134
Number: 4 weights: 97 - times: 214
Number: 5 weights: 38 - times: 72
Number: 6 weights: 42 - times: 79
Serial number: 7 weights: 51 - times: 113
Number: 8 weights: 16 - times: 28
Number: 9 Weights: 67 - Numbers: 149

 

As you can see, the bigger the weight, the bigger the hit probability page.

polling algorithm

The polling algorithm is to poll all services, and the probability of hitting each service is the same. The disadvantage of the polling algorithm is the same as that of the random algorithm, or it can not solve the problem of machine performance difference.

 

Realization

public class PollingLoadbalanceStrategy implements LoadbalanceStrategy {

    //Use oneMap To cache polling indexes for each type of application
    private Map<String,Integer> indexMap = new ConcurrentHashMap<>();

    public ProviderConfig select(List<ProviderConfig> configs, Object object){

        Integer index = indexMap.get(getKey(configs.get(0)));
        if(index == null){
            indexMap.put(getKey(configs.get(0)),0);
            return configs.get(0);
        }
        else {
            index++;
            if(index >= configs.size()){
                index = 0;
            }
            indexMap.put(getKey(configs.get(0)),index);
            return configs.get(index);
        }
    }

    public String getKey(ProviderConfig config){

        return  config.getInterfaceName();
    }
}

 

test

Or use the above method

System.out.println("\r\n Polling load balancing.....");
LoadbalanceStrategy strategy3 = new PollingLoadbalanceStrategy();
 loadbalace(strategy3,10,1000);

 

test result

Polling load balancing...
Number: 0 Weight: 88 - Number: 100
Number: 1 Weight: 82 - Number: 100
Number: 2 Weights: 58 - Numbers: 100
Number: 3 weights: 68 - times: 100
Number: 4 weights: 67 - times: 100
Number: 5 weights: 57 - times: 100
Number: 6 weights: 19 - times: 100
Serial number: 7 weights: 43 - times: 100
Number: 8 weights: 4 - times: 100
Number: 9 weights: 35 - times: 100

 

As you can see, every application has the same probability of hitting 1000 times.

 

Weighted Round Robin

The principle is the same as the weighted random algorithm above.

Realization

public class WeightPollingLoadbalanceStrategy implements LoadbalanceStrategy {

    private Map<String,Integer> indexMap = new ConcurrentHashMap<>();

    public ProviderConfig select(List<ProviderConfig> configs, Object object){

        Integer index = indexMap.get(getKey(configs.get(0)));
        if(index == null){
            indexMap.put(getKey(configs.get(0)),0);
            return configs.get(0);
        }
        else {

            List<ProviderConfig> newConfigs = new ArrayList<>();

            for(ProviderConfig config:configs){

                for(int i = 0; i< config.getWeight(); i++){
                    newConfigs.add(config);
                }
            }
            index++;
            if(index >= newConfigs.size()){
                index = 0;
            }
            indexMap.put(getKey(configs.get(0)),index);
            return newConfigs.get(index);

        }
    }

    public String getKey(ProviderConfig config){

        return  config.getInterfaceName();
    }
}

 

test

System.out.println("\r\n Weighted polling load balancing.....");
LoadbalanceStrategy strategy4 = new WeightPollingLoadbalanceStrategy();
loadbalace(strategy4,10,1000);

 

output

Weighted polling load balancing...
Number: 0 Weight: 77 - Number: 182
Number: 1 Weight: 75 - Number: 150
Number: 2 Weights: 22 - Numbers: 44
Number: 3 weights: 43 - times: 86
Number: 4 Weights: 59 - Number: 118
Number: 5 weights: 10 - times: 20
Number: 6 weights: 1 - times: 2
Serial number: 7 weights: 25 - times: 50
Number: 8 weights: 85 - times: 170
Number: 9 Weights: 89 - Numbers: 178

 

As you can see, the greater the weight, the greater the probability of hitting.

 

Minimum Delay Algorithms

Due to the differences of machine performance and network transmission, different application calls in cluster will take different time.

If we can reduce the hit rate of the application which takes a long time to call, improve the hit rate of the application which takes a short time to call, and achieve dynamic adjustment, so as to achieve the ultimate load balance, then we can solve the problem of the above performance differences.

The disadvantage is that Realization is more complex because it takes time to calculate the average call after startup.

 

Realization

public class LeastActiveLoadbalanceStrategy implements  LoadbalanceStrategy{

    public ProviderConfig select(List<ProviderConfig> configs, Object object){

        ProviderConfig[] registryConfigs= new ProviderConfig[configs.size()];
        configs.toArray(registryConfigs);

        Arrays.sort(registryConfigs, new Comparator<ProviderConfig>() {
            @Override
            public int compare(ProviderConfig o1, ProviderConfig o2) {

                if(o1.getCallTime() < o2.getCallTime()){
                    return -1;
                }

                else  if(o1.getCallTime() == o2.getCallTime()){
                    return 0;
                }
                else {
                    return 1;
                }
            }
        });

        return registryConfigs[0];
    }
}

 

Arrays.sort() is used here to achieve time-consuming sorting.

 

test

public void leastActiveLoadbalance(LoadbalanceStrategy strategy ,int configNum){

        List<ProviderConfig> configs = new ArrayList<>();

        for(int i = 0; i< configNum; i++){
            ProviderConfig config = new ProviderConfig();
            config.setInterfaceName("com.serviceImpl");
            config.setHost("127.0.0.1");
            config.setPort(i);
            config.setWeight(i);
       //Random numbers are used here to simulate the time-consuming call. config.setCallTime(
new Random().nextInt(100)); configs.add(config); } for(ProviderConfig c:configs){ System.out.println("Serial number:" + c.getPort() +"--time delay:" + c.getCallTime() ); } System.out.println("--------------"); ProviderConfig config = strategy.select(configs,null); System.out.println("Final choice Serial number:" + config.getPort() +"--time delay:" + config.getCallTime() ); }

 

Random numbers are used here to simulate the time-consuming call.

System.out.println("\r\n Minimum Delay Load Balancing.....");
LoadbalanceStrategy strategy5 = new LeastActiveLoadbalanceStrategy();
leastActiveLoadbalance(strategy5,10);

 

test result

Minimum delay load balancing...
Number: 0 - Delay: 83
Number: 1 - Delay: 3
Number: 2 - Delay: 60
Number: 3 - Delay: 52
Number: 4 - Delay: 73
Number: 5 - Delay: 74
Number: 6 - Delay: 37
Number: 7 - Delay: 59
Number: 8 - Delay: 83
Number: 9 - Delay: 2
--------------
Final Choice Number: 9 - Delay: 2

You can see that you hit the least time-consuming application.

 

Consistency hash algorithm

Firstly, we construct an integer ring of 232 length (this ring is called consistent Hash ring). According to the Hash value of the node name (its distribution is [0, 232-1]), the server node is placed on the Hash ring.

Then the Hash value is calculated according to the key value of the data (its distribution is also [0,232-1]), and then the server node closest to the Hash value is found clockwise on the Hash ring to complete the mapping of the Key to the server.

Consistency hash algorithm can also realize that a consumer hits a service provider all the time.

 

As shown below, there are four service providers

provider-1: 127.0.0.1:8001

provider-2: 127.0.5.2:8145

provider-3: 127.0.1.2:8123

provider-4: 127.1.3.2:8256

After hash calculation, four nodes are distributed in different positions of hash ring.

When a consumer (127.0.0.1:8011) calculates through hash and locates it at the location shown in the figure, it will clockwise look for the next node and select the first one to find.

 

There are several key issues:

1. The influence of hash algorithm

If the hash algorithm results are too centralized, as shown in the following figure, the distribution of nodes in a small range, if the majority of consumers hit outside the range, will lead to node 1 load abnormal large, the problem of load imbalance.

So we need a better hash algorithm.

The solution to this problem is to choose a good hashcode algorithm. hash algorithm comparison

 

 

2. Increasing or deleting nodes results in unbalanced load

The following picture:

Normally, each node has a 25% hit probability.

When node node 2 fails, all hits from previous node 2 are added to node 3, resulting in greater load on node 3.

When node 5 is added, the hits of node 3 are all given to node 5, and the load is unbalanced.

The solution to this problem is to add virtual nodes.

As shown in the following figure, adding virtual nodes to each node can make the distribution of hash ring more uniform. But one problem is that the more nodes, the greater the performance of maintenance. Therefore, how many virtual nodes need to be added, which needs to be tested according to actual needs.

 

Realization

The format of virtual node is 127.0.0.1:8001 &&node1.

The hashcode algorithm of jdk is compared with FNV1_32_HASH algorithm. .

public class UniformityHashLoadbalanceStrategy  implements  LoadbalanceStrategy{

    private static final int VIRTUAL_NODES = 5;

   //host:port for consumers
    public ProviderConfig select(List<ProviderConfig> configs, Object object){

        SortedMap<Integer, ProviderConfig> sortedMap = new TreeMap();

        for(ProviderConfig config:configs){
            for(int j = 0; j < VIRTUAL_NODES; j++){
                sortedMap.put(caculHash(getKey(config.getHost(),config.getPort(),"&&node"+j)),config);
            }
        }

        System.out.println(sortedMap);
        Integer requestHashcCode = caculHash((String)object);
        sortedMap.forEach((k,v)->{
            System.out.println("hashcode: " + k + "  " + v.getHost()+":"+v.getPort());
        });
        System.out.println("------------------Requested hashcode:"+requestHashcCode);

        SortedMap<Integer, ProviderConfig> subMap = sortedMap.tailMap(requestHashcCode);
        Integer index = subMap.firstKey();
        return  subMap.get(index);
    }
    private String getKey(String host,int port,String node){
        return new StringBuilder().append(host).append(":").append(port).append(node).toString();
    }

  //Calculate hash value
private int caculHash(String str){      //hashcode of jdk /*int hashCode = str.hashCode(); hashCode = (hashCode<0)?(-hashCode):hashCode; return hashCode;*/
     //FNV1_32_HASH algorithm final int p = 16777619; int hash = (int)2166136261L; for (int i = 0; i < str.length(); i++) hash = (hash ^ str.charAt(i)) * p; hash += hash << 13; hash ^= hash >> 7; hash += hash << 3; hash ^= hash >> 17; hash += hash << 5; // If the calculated value is negative, take its absolute value. if (hash < 0) hash = Math.abs(hash); return hash; } }

 

Test:

public void uniformityHashLoadbalanceStrategyTest(LoadbalanceStrategy strategy ,int configNum){

        List<ProviderConfig> configs = new ArrayList<>();
        for(int i = 0; i< configNum; i++){
            ProviderConfig config = new ProviderConfig();
            config.setInterfaceName("com.serviceImpl");
            config.setHost("127.0.0.1");
            config.setPort(new Random().nextInt(9999));
            config.setWeight(i);
            config.setCallTime(new Random().nextInt(100));
            configs.add(config);
        }

        ProviderConfig config = strategy.select(configs,"127.0.0.1:1234");
        System.out.println("Selection result:" + config.getHost() + ":" + config.getPort());
    }

 

 

 System.out.println("\r\n Uniformity hash load balancing.....");
uniformityHashLoadbalanceStrategyTest(new UniformityHashLoadbalanceStrategy(),10);

 

 

1. jdk hashcode algorithm

hashcode: 441720772  127.0.0.1:1280
hashcode: 441720773  127.0.0.1:1280
hashcode: 441720774  127.0.0.1:1280
hashcode: 441720775  127.0.0.1:1280
hashcode: 441720776  127.0.0.1:1280
hashcode: 1307619854  127.0.0.1:3501
hashcode: 1307619855  127.0.0.1:3501
hashcode: 1307619856  127.0.0.1:3501
hashcode: 1307619857  127.0.0.1:3501
hashcode: 1307619858  127.0.0.1:3501
hashcode: 1363372970  127.0.0.1:779
hashcode: 1363372971  127.0.0.1:779
hashcode: 1363372972  127.0.0.1:779
hashcode: 1363372973  127.0.0.1:779
hashcode: 1363372974  127.0.0.1:779
hashcode: 1397780469  127.0.0.1:5928
hashcode: 1397780470  127.0.0.1:5928
hashcode: 1397780471  127.0.0.1:5928
hashcode: 1397780472  127.0.0.1:5928
hashcode: 1397780473  127.0.0.1:5928
hashcode: 1700521830  127.0.0.1:4065
hashcode: 1700521831  127.0.0.1:4065
hashcode: 1700521832  127.0.0.1:4065
hashcode: 1700521833  127.0.0.1:4065
hashcode: 1700521834  127.0.0.1:4065
hashcode: 1774961903  127.0.0.1:5931
hashcode: 1774961904  127.0.0.1:5931
hashcode: 1774961905  127.0.0.1:5931
hashcode: 1774961906  127.0.0.1:5931
hashcode: 1774961907  127.0.0.1:5931
hashcode: 1814135809  127.0.0.1:5050
hashcode: 1814135810  127.0.0.1:5050
hashcode: 1814135811  127.0.0.1:5050
hashcode: 1814135812  127.0.0.1:5050
hashcode: 1814135813  127.0.0.1:5050
hashcode: 1881959435  127.0.0.1:1991
hashcode: 1881959436  127.0.0.1:1991
hashcode: 1881959437  127.0.0.1:1991
hashcode: 1881959438  127.0.0.1:1991
hashcode: 1881959439  127.0.0.1:1991
hashcode: 1889283041  127.0.0.1:4071
hashcode: 1889283042  127.0.0.1:4071
hashcode: 1889283043  127.0.0.1:4071
hashcode: 1889283044  127.0.0.1:4071
hashcode: 1889283045  127.0.0.1:4071
hashcode: 2118931362  127.0.0.1:7152
hashcode: 2118931363  127.0.0.1:7152
hashcode: 2118931364  127.0.0.1:7152
hashcode: 2118931365  127.0.0.1:7152
hashcode: 2118931366  127.0.0.1:7152
------------------Requested hashcode:35943393
//Selection results:127.0.0.1:1280

 

You can see the problem of JDK's default hashcode method, each virtual node is relatively centralized, and there will be a very serious load imbalance problem.

2. Using FNV1_32_HASH algorithm

hashcode: 23525275  127.0.0.1:1340
hashcode: 28411340  127.0.0.1:2589
hashcode: 43226263  127.0.0.1:1340
hashcode: 117776908  127.0.0.1:2848
hashcode: 190736195  127.0.0.1:6316
hashcode: 193232709  127.0.0.1:2848
hashcode: 238271827  127.0.0.1:1340
hashcode: 304277099  127.0.0.1:1245
hashcode: 323839980  127.0.0.1:1340
hashcode: 527125849  127.0.0.1:1704
hashcode: 571281671  127.0.0.1:1704
hashcode: 575621854  127.0.0.1:3295
hashcode: 595383578  127.0.0.1:8294
hashcode: 611098804  127.0.0.1:6316
hashcode: 635348472  127.0.0.1:1245
hashcode: 645589927  127.0.0.1:2923
hashcode: 724716685  127.0.0.1:2589
hashcode: 751044093  127.0.0.1:6316
hashcode: 770120156  127.0.0.1:6316
hashcode: 826884147  127.0.0.1:1340
hashcode: 828747447  127.0.0.1:3295
hashcode: 845151891  127.0.0.1:2923
hashcode: 1009576302  127.0.0.1:8294
hashcode: 1012854319  127.0.0.1:3295
hashcode: 1128338730  127.0.0.1:2848
hashcode: 1193699998  127.0.0.1:1245
hashcode: 1256565279  127.0.0.1:5227
hashcode: 1262478155  127.0.0.1:2848
hashcode: 1313976900  127.0.0.1:1704
hashcode: 1316234142  127.0.0.1:8294
hashcode: 1329044824  127.0.0.1:2589
hashcode: 1338628917  127.0.0.1:8294
hashcode: 1376780501  127.0.0.1:2923
hashcode: 1400591219  127.0.0.1:8294
hashcode: 1464238680  127.0.0.1:1704
hashcode: 1477774526  127.0.0.1:3295
hashcode: 1508307156  127.0.0.1:5227
hashcode: 1519292522  127.0.0.1:1245
hashcode: 1624736872  127.0.0.1:1704
hashcode: 1734567634  127.0.0.1:5227
hashcode: 1742088861  127.0.0.1:6316
hashcode: 1786711526  127.0.0.1:2923
hashcode: 1799136540  127.0.0.1:3295
hashcode: 1871586245  127.0.0.1:2589
hashcode: 1879550454  127.0.0.1:2923
hashcode: 1882602635  127.0.0.1:5227
hashcode: 1921744078  127.0.0.1:2848
hashcode: 1925433450  127.0.0.1:2589
hashcode: 2087729269  127.0.0.1:5227
hashcode: 2138998633  127.0.0.1:1245
------------------Requested hashcode:2064286659
//Selection results:127.0.0.1:5227

 

It can be seen that the distribution of virtual nodes is relatively scattered, which can achieve better results.

 

summary

The above gives the implementation ideas and code implementation of each load balancing algorithm, test results.

It is summarized as follows:

 

Random algorithm:

A good random algorithm can make the selection more balanced, but there will still be differences in machine performance resulting in different call time-consuming. The advantage is simplicity of implementation.

Weighted stochastic algorithm:

Different weight ratios can be adjusted according to different machine performance, so as to reduce the problems caused by machine performance differences.

Polling algorithm:

It can make the selection probability of each node consistent, but there will also be the problem of stochastic algorithm.

Weighted polling:

Different weight ratios can be adjusted according to different machine performance, so as to reduce the problems caused by machine performance differences.

Minimum delay algorithm:

According to the time-consuming dynamic adjustment of service invocation, better load balancing can be achieved. The disadvantage is that the implementation is more complex.

Consistency hash algorithm:

It enables consumers to always correspond to a service provider. The disadvantage is that the implementation is relatively complex. At the same time, the problem of uneven distribution is solved by optimizing hashcode algorithm and adding virtual nodes.

 

Later words

The above implementations only provide a way of thinking, which should be tested and optimized according to the actual situation.

Tags: Java JDK github network Load Balance

Posted on Sun, 05 May 2019 22:40:38 -0700 by Billy2007