Service Rate Limiting with Guava RateLimiter

Rate Limiting

Requirements

1. Limit the request rate on a single machine to avoid service failures caused by sudden traffic spikes.
2. No intrusion into business code.

Algorithms

The mainstream rate-limiting approaches are:

  • Limiting the number of calls per unit time
  • Limiting the number of concurrent calls to the system
  • The leaky bucket algorithm
  • The token bucket algorithm

Limiting the number of calls per unit time

This approach caps how often a service may be called within a unit of time. A counter tracks the number of calls to the service over the time window; once the count exceeds the configured threshold, further access is denied (or queued) until the next window begins.

  • Advantages: simple to implement, and the threshold can be configured dynamically.
  • Disadvantages: if the entire quota is consumed in a short burst at the start of a window, the service is denied for the remainder of that window. This phenomenon is known as "spike consumption".
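The counter approach can be sketched as a fixed-window limiter. This is an illustrative sketch, not code from this article; the class and method names are invented for the example:

```java
/** Fixed-window counter limiter -- an illustrative sketch, not the article's code. */
public class CounterLimiter {

    private final int limit;         // max calls allowed per window
    private final long windowMillis; // window length in milliseconds
    private int count = 0;           // calls seen in the current window
    private long windowStart = System.currentTimeMillis();

    public CounterLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the call fits within the current window's quota. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            // A new window has started: reset the counter
            windowStart = now;
            count = 0;
        }
        return ++count <= limit;
    }
}
```

The spike-consumption weakness is visible here: nothing stops all `limit` calls from landing in the first millisecond of a window.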

Limit flow by limiting the degree of concurrent calls to the system

By restricting the flow through concurrency, we limit the concurrent access to a service by restricting its concurrent access. In fact, we limit the access within a service unit's time period.
For example, restricting concurrent access to a service is 100, while the average processing time of a service is 10 milliseconds, which can provide (1000 / 10) * 100 = 10,000 times per second.

  • Advantages: There is a more restrictive boundary, which is suitable for a limit on the number of connections and threads.
  • Disadvantages: For services, it is difficult to tune the concurrency threshold, and it is difficult to accurately determine how appropriate the service threshold setting is.Semaphore is generally implemented, but Semaphore does not provide a way to reset semaphores, so dynamic threshold configuration is also a problem.
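A Semaphore-based sketch of this approach (illustrative; the class and method names are invented for the example):

```java
import java.util.concurrent.Semaphore;

/** Concurrency-based limiter sketch using a Semaphore -- not the article's code. */
public class ConcurrencyLimiter {

    private final Semaphore semaphore;

    public ConcurrencyLimiter(int maxConcurrent) {
        this.semaphore = new Semaphore(maxConcurrent);
    }

    /** Runs the task only if a concurrency slot is free; otherwise rejects immediately. */
    public boolean tryRun(Runnable task) {
        if (!semaphore.tryAcquire()) {
            return false; // over the concurrency limit: reject
        }
        try {
            task.run();
            return true;
        } finally {
            semaphore.release(); // always return the slot
        }
    }
}
```

Note that the number of permits is fixed at construction time, which is exactly the dynamic-configuration limitation mentioned above.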

Leaky Bucket Algorithm

Requests arrive at an indeterminate rate while the program processes them at a constant rate; this is the basic principle of the leaky bucket algorithm.

Picture a funnel with a water inlet and a water outlet, where the outlet discharges water at up to a maximum rate:

  • When the funnel is empty:
    • If the inlet rate is less than or equal to the maximum outlet rate, water flows out as fast as it flows in, and none accumulates.
    • If the inlet rate exceeds the maximum outlet rate, the funnel discharges at the maximum rate and the excess water accumulates in the funnel.
  • When the funnel already holds water:
    • The outlet discharges at the maximum rate.
    • If the funnel is not yet full, incoming water continues to accumulate.
    • If the funnel is full, incoming water overflows and is lost.

  • Advantage: no matter how large a traffic burst is, the leaky bucket guarantees a constant output rate.
  • Disadvantage: because the output rate is constant, an instantaneous traffic spike causes most of those requests to be discarded.
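The funnel behavior above can be sketched as follows (an illustrative sketch with invented names, not the article's code):

```java
/** Leaky bucket sketch: water leaks out at a constant rate; overflow is rejected. */
public class LeakyBucket {

    private final long capacity;       // max water the bucket can hold
    private final double leakPerMilli; // constant outflow rate
    private double water = 0;          // current water level
    private long lastLeak = System.currentTimeMillis();

    public LeakyBucket(long capacity, double leakPerSecond) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerSecond / 1000.0;
    }

    /** Returns true if the request fits in the bucket, false if it would overflow. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drain water according to the elapsed time, never below empty
        water = Math.max(0, water - (now - lastLeak) * leakPerMilli);
        lastLeak = now;
        if (water + 1 > capacity) {
            return false; // bucket full: the request overflows
        }
        water += 1;
        return true;
    }
}
```

Each accepted request adds one unit of "water"; the constant drain rate is what smooths bursts into a steady output.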

Token Bucket Algorithm

Many applications need some degree of burst transmission in addition to a limit on the average rate. The leaky bucket algorithm is a poor fit for this, and the token bucket algorithm works better.

The token bucket algorithm works as follows: the system generates tokens at a constant rate and puts them into a bucket of fixed capacity; once the bucket is full, extra tokens are discarded. To process a request, a token must be taken from the bucket; if the bucket is empty, the request is rejected.

  • Advantages: the token bucket limits the average call rate while still allowing some bursts. Guava's RateLimiter is based on the token bucket algorithm, so the code implementation is simple, and the token generation rate can be configured dynamically.

Comparing the four approaches above, the token bucket not only limits the average call rate but also tolerates a certain degree of burst traffic without discarding large numbers of requests; it is flexible and easy to implement. Conclusion: the token bucket algorithm is recommended for rate limiting.
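A minimal token bucket sketch illustrating the principle (Guava's RateLimiter is a production-grade implementation of the same idea; the class here is invented purely for illustration):

```java
/** Token bucket sketch: tokens refill at a fixed rate up to capacity; each request takes one. */
public class TokenBucket {

    private final long capacity;         // max tokens the bucket can hold
    private final double tokensPerMilli; // refill rate
    private double tokens;               // tokens currently available
    private long lastRefill = System.currentTimeMillis();

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.tokensPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity; // start full so an initial burst is allowed
    }

    /** Returns true if a token was available and consumed, false otherwise. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Add tokens for the elapsed time, capped at capacity
        tokens = Math.min(capacity, tokens + (now - lastRefill) * tokensPerMilli);
        lastRefill = now;
        if (tokens < 1) {
            return false; // no token available: reject
        }
        tokens -= 1;
        return true;
    }
}
```

Because up to `capacity` tokens can accumulate while traffic is idle, the bucket can serve a burst of that size instantly, unlike the leaky bucket's fixed output rate.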

Code

Rate Limiting Design

Requests are intercepted by an aspect, which checks whether rate limiting is enabled. If it is, the aspect tries to acquire a token: on success the business logic runs; otherwise the request is rejected.

Token acquisition:

  1. No timeout configured: attempt once and return the result immediately.
  2. Timeout configured: keep trying to acquire a token until the timeout expires.

Environment Configuration

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <version>2.1.1.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
    <version>2.1.1.RELEASE</version>
</dependency>
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>18.0</version>
</dependency>

Configuration File

ratelimit.properties

# Rate limiting module configuration

# Whether rate limiting is enabled
ratelimit.doRateLimit=false

# Acquisition timeout in milliseconds (if set, acquisition waits up to this long; if 0, it returns immediately)
ratelimit.waitTimeout=20

# Rate limit protection: TPS allowed per service (the maximum TPS a single service can sustain must be evaluated)
ratelimit.permitsPerSecond=200

RateLimitConfig

/**
 * Rate limiting configuration properties (Lombok @Data generates getters and setters).
 */
@Data
@Component
@ConfigurationProperties(prefix = "ratelimit")
@PropertySource(value = "classpath:ratelimit.properties")
public class RateLimitConfig implements Serializable {
    
    private static final long serialVersionUID = 1L;
    private boolean           doRateLimit      = false;
    private long              waitTimeout;
    private long              permitsPerSecond;

}

Rate Limiting Service

/**
 * Rate limiting service interface.
 */
public interface IRateLimitService {

    /**
     * Try to acquire one permit without blocking.
     * 
     * @return true if the permit was acquired
     */
    boolean tryAcquire();

    /**
     * Try to acquire the given number of permits without blocking.
     * 
     * @param permits number of permits to acquire
     * @return true if the permits were acquired
     */
    boolean tryAcquire(int permits);

    /**
     * Block until one permit is acquired, returning false if the timeout expires first.
     * 
     * @param timeout maximum wait in milliseconds
     * @return true if the permit was acquired within the timeout
     */
    boolean acquire(long timeout);

    /**
     * Block until the given number of permits is acquired, returning false if the timeout expires first.
     * 
     * @param permits number of permits to acquire
     * @param timeout maximum wait in milliseconds
     * @return true if the permits were acquired within the timeout
     */
    boolean acquire(int permits, long timeout);

}
/**
 * Guava RateLimiter based implementation.
 */
@Service
public class RateLimitServiceImpl implements IRateLimitService {

    private RateLimitConfig config;

    private RateLimiter rateLimiter;

    @Autowired
    public RateLimitServiceImpl(RateLimitConfig config) {
        this.config = config;
        this.rateLimiter = RateLimiter.create(this.config.getPermitsPerSecond());
    }

    @Override
    public boolean tryAcquire() {
        return this.tryAcquire(1);
    }

    @Override
    public boolean tryAcquire(int permits) {
        return rateLimiter.tryAcquire(permits);
    }

    @Override
    public boolean acquire(long timeout) {
        return this.acquire(1, timeout);
    }

    @Override
    public boolean acquire(int permits, long timeout) {
        // Delegate to Guava's timed tryAcquire (requires the java.util.concurrent.TimeUnit
        // import) instead of busy-waiting in a spin loop.
        return rateLimiter.tryAcquire(permits, timeout, TimeUnit.MILLISECONDS);
    }

}

Aspect Interception

/**
 * Rate limiting aspect.
 */
@Aspect
@Order(-1)
@Component
public class RateLimitAspect {
    private static final Logger LOG = LoggerFactory.getLogger(RateLimitAspect.class);

    private RateLimitConfig config;

    private IRateLimitService rateLimitService;

    @Autowired
    public RateLimitAspect(RateLimitConfig config, IRateLimitService rateLimitService) {
        this.config = config;
        this.rateLimitService = rateLimitService;
    }

    @Pointcut("execution(public * xxx.xxx.*Controller.*(..))")
    public void executionMethod() {}

    @Around(value = "executionMethod()")
    public Object doRateLimit(ProceedingJoinPoint pjp) throws Throwable {
        if (LOG.isDebugEnabled()) {
            LOG.debug("Entering rate limiting aspect");
        }
        Object result;
        // Check whether rate limiting is enabled
        if (config.isDoRateLimit()) {
            // Rate limiting enabled
            boolean acquireResult;
            // 1. Check whether a timeout is configured
            if (config.getWaitTimeout() == 0L) {
                // 2. No timeout: try to acquire a token once
                acquireResult = rateLimitService.tryAcquire();
            } else {
                // 2. Timeout configured: try to acquire a token within the timeout
                acquireResult = rateLimitService.acquire(config.getWaitTimeout());
            }
            if (acquireResult) {
                // 3. Token acquired: proceed with the business logic
                result = pjp.proceed();
            } else {
                // 3. Token not acquired: return 429 Too Many Requests
                result = ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS).build();
            }
        } else {
            // Rate limiting disabled: proceed directly
            result = pjp.proceed();
        }

        if (LOG.isDebugEnabled()) {
            LOG.debug("Exiting rate limiting aspect");
        }
        return result;
    }

}

Test

Test Environment

  • Tool: JMeter
  • Request body: {"name": "Zhang San", "age": 12}
  • Response body: {"name": "Zhang San", "age": 12}
  • Business processing time: 100 ms
  • Configured rate limit (TPS): 1500

Test Results

(Charts omitted: transactions per second (TPS), hits per second, response time, and the JMeter aggregate reports for successful, error, and all requests.)

Summary

  1. TPS remained stable at 1,500 per second across more than 20,000 requests.
  2. The aspect-based approach is transparent to the application, with no intrusion into business code.

Tags: Java Spring less Google

Posted on Wed, 28 Aug 2019 18:41:31 -0700 by aspekt9