Handling Binance API Rate Limits: What Does 'Weight' Mean?

2026-04-23 · 11 min read

Rate limiting is the most common trap for quantitative strategies built on an exchange API. This article explains the two kinds of limit, weight and order count, and how to back off gracefully before you hit either one.

Everything below assumes you have already created an API Key on the Binance official website.

Two Dimensions of Rate Limits

The Binance API uses two independent dimensions to control traffic:

1. Weight

Every endpoint has an assigned "weight" value. The total weight of all requests per minute must not exceed a specific limit.

Level	Weight per Minute
Standard	1,200
VIP 1	2,400
VIP 2	3,600
...	Increases with VIP level

2. Order Count

There are additional limits specifically for order placement:

Time Window	Order Limit
10 Seconds	100
1 Day	200,000

Exceeding either limit results in a temporary ban.
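To make the two dimensions concrete, here is a minimal client-side sketch that tracks both of them locally before a request is sent. The class name `LocalBudget` and its methods are hypothetical, the sliding windows are an assumption about how to approximate the server's counters, and the daily order limit is left out for brevity. The server's own counters remain authoritative.

```python
from collections import deque


class LocalBudget:
    """Local estimate of weight (per 60 s) and order count (per 10 s)."""

    def __init__(self, weight_limit=1200, order_limit=100):
        self.weight_limit = weight_limit  # standard tier from the table above
        self.order_limit = order_limit    # 10-second order limit from the table above
        self._weights = deque()           # (timestamp, weight) pairs
        self._orders = deque()            # timestamps of placed orders

    def _expire(self, now):
        # Drop entries that have left their window.
        while self._weights and now - self._weights[0][0] >= 60:
            self._weights.popleft()
        while self._orders and now - self._orders[0] >= 10:
            self._orders.popleft()

    def can_send(self, now, weight, is_order=False):
        """True if a request of this weight fits in both budgets at time `now`."""
        self._expire(now)
        used = sum(w for _, w in self._weights)
        if used + weight > self.weight_limit:
            return False
        if is_order and len(self._orders) + 1 > self.order_limit:
            return False
        return True

    def record(self, now, weight, is_order=False):
        """Register a request that was actually sent."""
        self._weights.append((now, weight))
        if is_order:
            self._orders.append(now)
```

Passing `now` explicitly (for example `time.monotonic()`) keeps the class easy to test. An order request is checked against both budgets, while a market-data request is checked against weight only.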

Weight per Endpoint

Common examples:

Endpoint	Weight
Market Snapshot	1
Order Book (5 levels)	1
Order Book (100 levels)	5
Order Book (1,000 levels)	50
K-line / Candles	1
Place Order	1
Cancel Order	1
Account Information	10

Fetching a deep order book once consumes 50 weight, meaning you can only do it 24 times per minute.
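The same arithmetic can be written down once and reused for any endpoint. This is a small sketch using the weights from the table above; the dictionary keys and the `reserved` parameter (weight you want to keep free for other calls) are illustrative names, not part of the API.

```python
STANDARD_WEIGHT_LIMIT = 1200  # per minute, standard tier from the table above

# Weights copied from the table above; the keys are illustrative labels.
ENDPOINT_WEIGHT = {
    "order_book_5": 1,
    "order_book_100": 5,
    "order_book_1000": 50,
    "account_info": 10,
}


def max_calls_per_minute(weight, limit=STANDARD_WEIGHT_LIMIT, reserved=0):
    """Whole number of calls that fit in the budget left after `reserved` weight."""
    return max(limit - reserved, 0) // weight


print(max_calls_per_minute(ENDPOINT_WEIGHT["order_book_1000"]))                # 24
print(max_calls_per_minute(ENDPOINT_WEIGHT["order_book_1000"], reserved=200))  # 20
```

The figure of 24 only holds if nothing else uses weight in that minute. Reserving 200 weight for orders and account queries already cuts it to 20.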

Response Headers

Every API response includes headers to help you monitor usage:

X-MBX-USED-WEIGHT-1M: Weight used in the current minute.
X-MBX-ORDER-COUNT-10S: Orders placed in the current 10-second window.

Monitor these values to decelerate requests before hitting the wall.
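One way to act on these headers is to read them after every response and turn the usage into a pause length. The sketch below assumes `headers` is a plain dictionary of response headers; the function names, the 80% starting point, and the 10-second maximum pause are arbitrary illustrative choices, not values recommended by Binance.

```python
def parse_usage(headers):
    """Extract the two counters named above; a missing header counts as 0."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return {
        "weight_1m": int(lowered.get("x-mbx-used-weight-1m", 0)),
        "orders_10s": int(lowered.get("x-mbx-order-count-10s", 0)),
    }


def pause_for(used_weight, limit=1200, start_at=0.8, max_pause=10.0):
    """Seconds to sleep: 0 up to start_at * limit, rising linearly to max_pause at the limit."""
    threshold = limit * start_at
    if used_weight <= threshold:
        return 0.0
    fraction = min((used_weight - threshold) / (limit - threshold), 1.0)
    return fraction * max_pause
```

A strategy loop would call `time.sleep(pause_for(parse_usage(headers)["weight_1m"]))` after each request, so it slows down gradually instead of stopping abruptly at the limit.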

Consequences of Overuse

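429 Error: You have exceeded a limit and are temporarily blocked. Stop all requests and wait before retrying; the block usually lasts from a few seconds to a few minutes.
418 Error: Triggered by continuing to send requests after receiving 429 errors. This results in an IP ban that can last 24 hours or even days. Never try to brute-force your way past these errors.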
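The two status codes can be mapped to a wait time in one small function. This sketch relies on the `Retry-After` header, which Binance's API documentation describes as accompanying 429 and 418 responses; the function name and the fallback defaults (60 seconds and 1 hour) are assumptions made for illustration.

```python
def seconds_to_wait(status, headers, default_429=60, default_418=3600):
    """Return how long to pause after a response, or 0 if no pause is needed."""
    if status not in (429, 418):
        return 0
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after")
    # Prefer the server's own hint when it is a plain number of seconds.
    if retry_after is not None and str(retry_after).strip().isdigit():
        return int(retry_after)
    # Fallbacks are guesses: a 418 means an IP ban, so wait far longer.
    return default_418 if status == 418 else default_429
```

Whatever value comes back, the important part is that no request is sent until the wait has elapsed.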

Graceful Backoff

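import time


class RateLimitError(Exception):
    """Placeholder: raised by api_call() when the server answers 429."""


def call_with_backoff(api_call, soft_limit=1000):
    # api_call is assumed to return a (result, headers) pair.
    while True:
        try:
            result, headers = api_call()
        except RateLimitError:
            time.sleep(60)  # Wait for a full minute, then retry
            continue
        weight_used = int(headers.get('X-MBX-USED-WEIGHT-1M', 0))
        if weight_used > soft_limit:
            time.sleep(10)  # Slow down before reaching the hard limit
        return result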
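The loop above waits a fixed 60 seconds after every rate-limit error. A common alternative, shown here as a sketch rather than as the method this article prescribes, is exponential backoff with optional jitter: each retry waits longer than the last, up to a cap. The function name and all default values are illustrative.

```python
import random


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6, jitter=0.0, rng=random):
    """Yield one delay per retry: base * factor**n, capped, plus up to `jitter` fraction of noise."""
    for n in range(attempts):
        delay = min(base * factor ** n, cap)
        # Jitter spreads retries out so several workers do not retry in lockstep.
        yield delay + delay * jitter * rng.random()


print(list(backoff_delays()))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

In a retry loop you would call `time.sleep(delay)` for each yielded value and give up once the generator is exhausted.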

Tips to Reduce Weight Consumption

1. Reduce Depth Levels

If you only need 5 levels of the order book, don't pull 1,000 levels. There is a 50x difference in weight.

2. Switch to WebSockets

Prices, order books, and K-lines can be streamed in real-time via WebSockets (WS). WS streams do not consume weight.

3. Use Batch Endpoints

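A batch order endpoint lets you place up to 5 orders in one request, which can cost less weight than 5 individual calls. The path given here, /api/v3/batchOrders, should be checked against the current API reference for your market before you rely on it; on the futures API the batch endpoint is /fapi/v1/batchOrders.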
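If a strategy generates more orders than one batch request accepts (5, as stated above), the order list has to be split first. A minimal sketch follows; the function name is hypothetical and the orders can be any objects your client library expects.

```python
def chunk_orders(orders, batch_size=5):
    """Split a list of orders into consecutive groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [orders[i:i + batch_size] for i in range(0, len(orders), batch_size)]


batches = chunk_orders(list(range(12)))
print(len(batches))  # 3
```

Twelve orders become three requests instead of twelve. Each batch still counts as several orders against the order-count limit, so only the request weight is saved.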

4. Implement Caching

Cache data that doesn't need to be updated every single second.
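A time-to-live cache is one simple way to do this. The sketch below is a generic helper, not part of any Binance library; `TTLCache` and `get_or_fetch` are hypothetical names, and `fetch` stands for whatever function performs the real REST call.

```python
import time


class TTLCache:
    """Return a stored value until it is older than ttl_seconds, then refetch."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no request, no weight spent
        value = fetch()    # stale or missing: spend the weight once
        self._store[key] = (now + self.ttl, value)
        return value
```

With a 5-second TTL on account information (weight 10 in the table above), a loop that asks for the balance every second spends 10 weight every 5 seconds instead of every second.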

WebSocket Advantages

Data Type	REST Weight	WebSocket
Market Price	1 per call	Streamed, 0 weight
Order Book	1–50 per call	Streamed, 0 weight
K-lines	1 per call	Streamed, 0 weight

Best Practice Strategy:

Use WebSockets for real-time data flow (0 weight consumption).
Use REST only for placing and cancelling orders.
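As a starting point for the WebSocket side, the sketch below only builds the URL for a combined stream; connecting to it requires a WebSocket client library of your choice. The host and the stream-name patterns reflect Binance's published spot stream conventions as I understand them and should be confirmed against the current documentation. The function names are hypothetical.

```python
WS_BASE = "wss://stream.binance.com:9443"  # assumed spot stream host; confirm in the docs


def stream_name(symbol, kind, arg=None):
    """Build one stream name, e.g. btcusdt@depth5 or btcusdt@kline_1m."""
    symbol = symbol.lower()  # stream names use lowercase symbols
    if kind == "depth":
        return f"{symbol}@depth{arg}"   # partial order book with `arg` levels
    if kind == "kline":
        return f"{symbol}@kline_{arg}"  # candles with interval `arg`
    if kind == "ticker":
        return f"{symbol}@ticker"
    raise ValueError(f"unknown stream kind: {kind}")


def combined_url(names):
    """URL that multiplexes several streams over one connection."""
    return f"{WS_BASE}/stream?streams=" + "/".join(names)


url = combined_url([
    stream_name("BTCUSDT", "depth", 5),
    stream_name("BTCUSDT", "kline", "1m"),
])
```

One connection to this URL replaces two REST polling loops, leaving the weight budget for order placement and cancellation.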

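Spreading Load Across Multiple Accounts

Using multiple API Keys across different accounts spreads consumption for any limit that is counted per account.

Strategy:

Use the Main account Key for order placement.
Use Sub-account Keys for market queries.
This enlarges your available pool only for per-account limits. Limits counted per IP are still shared when every Key runs from the same address (see Common Mistakes), so the full benefit requires separate IPs as well.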

Boosting Weight with VIP Levels

Higher VIP levels offer:

Increased per-minute weight limits.
Higher limits for specific endpoints.
Benefits that can be even more valuable than fee discounts.

Even VIP 1 doubles your weight limit, which is highly beneficial for high-frequency strategies.

Monitoring Code Snippet

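class RateMonitor:
    """Tracks the latest X-MBX-USED-WEIGHT-1M value and warns near the limit."""

    def __init__(self, limit=1200, alert_threshold=1000):
        self.limit = limit                      # per-minute weight limit for your tier
        self.alert_threshold = alert_threshold  # warn once usage exceeds this value
        self.last_weight = 0

    def update(self, headers):
        # Call this with the headers of every REST response.
        used = int(headers.get('X-MBX-USED-WEIGHT-1M', 0))
        if used > self.alert_threshold:
            print(f"WARN: {used} / {self.limit} used")
        self.last_weight = used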

Common Mistakes

1. Using Only REST in a Loop

Polling account information (weight 10) and K-lines (weight 1) once per second via REST already costs 660 weight per minute, more than half of the standard 1,200, before a single order is placed. Switch to WS for streaming data.

2. Failing to Handle Exceptions

Sending requests immediately after a 429 error instead of sleeping will trigger a 418 IP ban, which lasts far longer than the original block.

3. Multiple Keys on a Single IP

While Keys are separate, IP-based rate limits still exist. Multiple Keys on one IP may be counted together for certain limits.

4. Ignoring Weight Headers

Running requests blindly until you hit the limit is a recipe for disaster. Monitor the headers.

Pre-Live Testing

Run high-frequency tests on the testnet to see if you trigger 429 errors, then adjust.
Set up monitoring alerts for your live environment to warn you when weight usage approaches the limit so you can decelerate automatically.

FAQ

Q: How long does a 429 ban last? A: Usually a few seconds to a few minutes. Do not send any requests during this time.

Q: How long does a 418 ban last? A: It depends on how often you have offended. Binance's API documentation describes IP bans that scale for repeat offenders, from 2 minutes up to 3 days.

Q: Can I appeal a ban? A: You can contact support, but it's rarely effective. You simply have to wait it out.

Q: Is the weight shared across all endpoints? A: Yes. All weight used across all endpoints adds up to your per-minute limit (e.g., 1,200/min).

Q: Is the Order limit per IP or per account? A: It is per account. Using multiple IPs for the same account will not bypass the order limit.
