
Throttling Method Calls to M Requests in N Seconds

java
rate-limiting
token-bucket
delay-queue
by Alex Kataev · Mar 8, 2025
TLDR

Use Guava's RateLimiter to limit method calls to a certain threshold. Set the rate limit at construction with the following formula: double permitsPerSecond = (double) M / N; (M permits spread over N seconds). Then, invoke rateLimiter.acquire() before every method call to ensure rate adherence.

import com.google.common.util.concurrent.RateLimiter;

public class Throttler {
    // Because we can't allow more threads than a cat has lives, right?
    private final RateLimiter rateLimiter;

    public Throttler(int maxRequests, int timeSpanSeconds) {
        // M permits over N seconds = M / N permits per second
        this.rateLimiter = RateLimiter.create((double) maxRequests / timeSpanSeconds);
    }

    public void doCall() {
        rateLimiter.acquire(); // Keeping everything orderly, like in a British queue
        // Place your logic here
    }
}

Replace doCall's body with your method implementation. This pattern ensures at most M calls per N seconds, blocking callers when the limit is reached.

A Deeper Dive: Advanced Throttling Methods

Just invoking rateLimiter.acquire() is the tip of the iceberg. Specific use cases may need the token bucket algorithm for more precise control or DelayQueue for accurate timing.

Grasping Time: Precision and Flexibility

For high-resolution timing requirements, extend the basic approach to track request timestamps in milliseconds or nanoseconds rather than whole seconds.
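One way to sketch this, assuming a nanosecond-precision sliding window (class and method names here are illustrative, not from any library):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding-window limiter with nanosecond resolution: keeps the timestamps
// of recent calls and rejects a new call while M of them are still
// inside the window.
public class NanoSlidingWindow {
    private final int maxRequests;
    private final long windowNanos;
    private final Deque<Long> calls = new ArrayDeque<>();

    public NanoSlidingWindow(int maxRequests, long windowNanos) {
        this.maxRequests = maxRequests;
        this.windowNanos = windowNanos;
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        // Drop timestamps that have aged out of the window.
        while (!calls.isEmpty() && now - calls.peekFirst() >= windowNanos) {
            calls.pollFirst();
        }
        if (calls.size() >= maxRequests) {
            return false; // window is full
        }
        calls.addLast(now);
        return true;
    }
}
```

Using System.nanoTime() instead of System.currentTimeMillis() avoids wall-clock adjustments skewing the window.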

The Ring: Efficient Tracking with Buffers

Use a fixed-size ring buffer to optimize memory. It holds timestamps of the most recent M requests, facilitating a comparison against the oldest entry for incoming requests.
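A minimal sketch of that ring buffer, assuming a millisecond window (the names are assumptions for illustration):

```java
// Ring buffer of the last M request times: a new request is allowed only
// if the oldest recorded request has already left the time window.
public class RingBufferThrottler {
    private final long[] timestamps;   // fixed-size buffer, one slot per permitted request
    private final long timeSpanMillis;
    private int index = 0;             // always points at the oldest entry

    public RingBufferThrottler(int maxRequests, long timeSpanMillis) {
        this.timestamps = new long[maxRequests];
        this.timeSpanMillis = timeSpanMillis;
    }

    /** Returns true and records the call if it is within the limit. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        long oldest = timestamps[index];
        if (oldest != 0 && now - oldest < timeSpanMillis) {
            return false; // the M-th most recent call is still inside the window
        }
        timestamps[index] = now;                  // overwrite the oldest slot
        index = (index + 1) % timestamps.length;  // advance to the new oldest
        return true;
    }
}
```

Memory stays O(M) no matter how long the throttler runs, which is the point of the ring.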

The Aspect Approach: A Cleaner Way to Code

For better maintainability and scalability, use AspectJ to keep throttling logic separate from the main business logic.

Redis: Throttling for Distributed Systems

In distributed environments, consider RedisRateLimit for tracking request rates across multiple application instances.

Alternative Approaches to Throttling

DelayQueue: Handling Spikes

For dealing with request bursts, employ a DelayQueue with Delayed instances. The service can handle spikes and not drop requests, queuing them for execution when allowed.
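A sketch of this pattern, assuming M reusable "slots" held in a DelayQueue (class names are illustrative): a caller takes a slot, and that slot becomes available again N seconds later, so bursts wait in line instead of being dropped.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class DelayQueueThrottler {
    private static final class Slot implements Delayed {
        private final long availableAtNanos;
        Slot(long availableAtNanos) { this.availableAtNanos = availableAtNanos; }
        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(availableAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }
        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    private final DelayQueue<Slot> slots = new DelayQueue<>();
    private final long timeSpanNanos;

    public DelayQueueThrottler(int maxRequests, long timeSpanSeconds) {
        this.timeSpanNanos = TimeUnit.SECONDS.toNanos(timeSpanSeconds);
        long now = System.nanoTime();
        for (int i = 0; i < maxRequests; i++) {
            slots.add(new Slot(now)); // all M slots start out immediately available
        }
    }

    /** Blocks until a slot is free, then reserves it for the next N seconds. */
    public void acquire() throws InterruptedException {
        slots.take();                                      // waits if all M slots are in use
        slots.add(new Slot(System.nanoTime() + timeSpanNanos));
    }
}
```

Because take() blocks rather than failing, a burst is smoothed out over time instead of rejected.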

Token Bucket: Adaptive Rate Limiting

The token bucket algorithm provides flexibility during peak loads, allowing bursts while enforcing long-term limits. Tokens refill over time, and each request spends a token to proceed.
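A minimal token-bucket sketch, assuming hand-rolled names (capacity caps the burst size, refillPerSecond sets the long-term rate):

```java
// Token bucket: up to `capacity` requests may burst through at once;
// over time the sustained rate converges to `refillPerSecond`.
public class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    public TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;                 // start full so an initial burst passes
        this.lastRefillNanos = System.nanoTime();
    }

    public synchronized boolean tryConsume() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;   // spend one token for this request
            return true;
        }
        return false;        // bucket empty: request is throttled
    }

    private void refill() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = now;
    }
}
```

Guava's RateLimiter is itself built on this idea; rolling your own mainly buys you control over burst capacity and refill policy.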

Ratelimitj-Redis: Configurable Rate Limiting

Explore Ratelimitj-Redis, an open-source library on GitHub that supports various rate-limiting strategies with Redis backends. It lets you vary rate limits by IP or user account, adding granularity to your resource access rules.