Handling Binance API Rate Limits: What Does 'Weight' Mean?

2026-04-23 · 11 min read

Rate limiting is a common pitfall for API strategies. This article explains the two kinds of limit, request weight and order count, and shows how to back off gracefully.

To follow along, you need an API Key created on the Binance official website.

Two Dimensions of Rate Limits

The Binance API uses two independent dimensions to control traffic:

1. Weight

Every endpoint has an assigned "weight", the cost charged for each request to it. The total weight of all requests you send per minute must not exceed the limit for your level.

Level       Weight per Minute
Standard    1,200
VIP 1       2,400
VIP 2       3,600
...         Increases with VIP level
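Rather than hard-coding these figures, a client can read them at start-up: the exchangeInfo endpoint returns a rateLimits array describing the limits currently in force. The sketch below only indexes such an array; SAMPLE_RATE_LIMITS is an illustrative payload whose numbers mirror this article's tables, not live values, and index_rate_limits is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# Illustrative payload in the shape of the `rateLimits` array.
# The numbers mirror this article's tables; they are NOT live values.
SAMPLE_RATE_LIMITS = [
    {"rateLimitType": "REQUEST_WEIGHT", "interval": "MINUTE", "intervalNum": 1, "limit": 1200},
    {"rateLimitType": "ORDERS", "interval": "SECOND", "intervalNum": 10, "limit": 100},
    {"rateLimitType": "ORDERS", "interval": "DAY", "intervalNum": 1, "limit": 200000},
]


def index_rate_limits(rate_limits: List[dict]) -> Dict[Tuple[str, str], int]:
    """Map (limit type, window) to the limit, e.g. ("ORDERS", "10 SECOND")."""
    indexed = {}
    for entry in rate_limits:
        window = f"{entry['intervalNum']} {entry['interval']}"
        indexed[(entry["rateLimitType"], window)] = entry["limit"]
    return indexed


limits = index_rate_limits(SAMPLE_RATE_LIMITS)
print(limits[("REQUEST_WEIGHT", "1 MINUTE")])
```

Reading the limits once and passing them to your throttling code keeps the thresholds in one place if the exchange changes them.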

2. Order Count

There are additional limits specifically for order placement:

Time Window    Order Limit
10 Seconds     100
1 Day          200,000

Exceeding either limit results in a temporary ban.
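One way to stay under the 10-second order limit is to count your own orders locally before sending them. The following is a minimal sketch, not Binance's own accounting: OrderWindow is a hypothetical class that uses a sliding window, which is at least as strict as a fixed window of the same length. The clock is injectable so the logic can be tested without waiting.

```python
import time
from collections import deque


class OrderWindow:
    """Local sliding-window counter for order placement (hypothetical helper)."""

    def __init__(self, max_orders=100, window_seconds=10.0, clock=time.monotonic):
        self.max_orders = max_orders
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # Timestamps of orders still inside the window

    def try_acquire(self) -> bool:
        """Return True and record the order if there is room, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_orders:
            return False
        self._sent.append(now)
        return True
```

Call try_acquire() before each order and skip or delay the order when it returns False.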

Weight per Endpoint

Common examples:

Endpoint                      Weight
Market Snapshot               1
Order Book (5 levels)         1
Order Book (100 levels)       5
Order Book (1,000 levels)     50
K-line / Candles              1
Place Order                   1
Cancel Order                  1
Account Information           10

Fetching a 1,000-level order book consumes 50 weight per call, so at the standard limit you can do it at most 1,200 / 50 = 24 times per minute, and only if you send nothing else.
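The same arithmetic can be applied to a whole polling plan before you deploy it. This is a small sketch using the weights from the table above; the dictionary keys and function names are made up for illustration.

```python
WEIGHT_LIMIT_PER_MINUTE = 1200  # Standard level, from the table above

# Weights copied from this article's table; the key names are illustrative.
ENDPOINT_WEIGHTS = {
    "order_book_5": 1,
    "order_book_100": 5,
    "order_book_1000": 50,
    "klines": 1,
    "account": 10,
}


def max_calls_per_minute(endpoint: str, limit: int = WEIGHT_LIMIT_PER_MINUTE) -> int:
    """How many times one endpoint can be called if nothing else is sent."""
    return limit // ENDPOINT_WEIGHTS[endpoint]


def weight_per_minute(calls_per_minute: dict) -> int:
    """Total weight of a plan given as {endpoint: calls per minute}."""
    return sum(ENDPOINT_WEIGHTS[name] * n for name, n in calls_per_minute.items())


# Polling account information and K-lines once per second each:
plan = {"account": 60, "klines": 60}
print(max_calls_per_minute("order_book_1000"), weight_per_minute(plan))
```

With these weights, polling account information and K-lines once per second already uses 660 of the 1,200 available.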

Response Headers

Every API response includes headers to help you monitor usage:

  • X-MBX-USED-WEIGHT-1M: Weight used in the current minute.
  • X-MBX-ORDER-COUNT-10S: Orders placed in the current 10-second window.

Watch these values and slow your requests down before you reach the limit.
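A sketch of how these headers might be pulled out of a response. It assumes headers arrive as a plain mapping of strings and that the suffix is a number followed by a unit letter, as in the two names above; parse_usage is a hypothetical helper.

```python
import re
from typing import Dict, Mapping

# Matches names such as X-MBX-USED-WEIGHT-1M or X-MBX-ORDER-COUNT-10S.
_USAGE_HEADER = re.compile(
    r"^x-mbx-(used-weight|order-count)-(\d+[smhd])$", re.IGNORECASE
)


def parse_usage(headers: Mapping[str, str]) -> Dict[str, int]:
    """Return e.g. {"used-weight-1m": 37} from a response's headers."""
    usage = {}
    for name, value in headers.items():
        match = _USAGE_HEADER.match(name)
        if match:
            kind = match.group(1).lower()
            window = match.group(2).lower()
            usage[f"{kind}-{window}"] = int(value)
    return usage
```

Matching by pattern rather than by exact name means other windows are picked up too, if the server reports them.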

Consequences of Overuse

  • 429 Error: The rate limit has been exceeded. Stop all requests and wait; the Retry-After response header gives the number of seconds to wait.
  • 418 Error: Triggered by repeatedly ignoring 429 errors. This is an IP ban whose length grows with repeat offences and can reach several days. Never retry through these errors.
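Both responses can be handled with one small decision function. The sketch below assumes the server sends a Retry-After header with the number of seconds to wait on 429 and 418 responses, and falls back to a default when the header is missing or unreadable. seconds_to_wait is a hypothetical helper and the 60-second default is an assumption.

```python
from typing import Mapping, Optional


def seconds_to_wait(
    status_code: int,
    headers: Mapping[str, str],
    default_wait: float = 60.0,
) -> Optional[float]:
    """Return how long to pause, or None if this is not a rate-limit response."""
    if status_code not in (429, 418):
        return None
    retry_after = headers.get("Retry-After")
    try:
        return max(float(retry_after), 0.0)
    except (TypeError, ValueError):
        # Header missing or not a number: fall back to a conservative wait.
        return default_wait
```

The lookup is case-sensitive on a plain dict; HTTP client libraries usually provide a case-insensitive headers mapping.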

Graceful Backoff

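import time

class RateLimitError(Exception):
    """Placeholder for whatever your client raises on HTTP 429."""

def call_with_backoff(api_call, soft_limit=1000):
    # api_call is assumed to return (result, headers) and to raise
    # RateLimitError when the exchange answers 429.
    while True:
        try:
            result, headers = api_call()
            weight_used = int(headers.get('X-MBX-USED-WEIGHT-1M', 0))
            if weight_used > soft_limit:
                time.sleep(10)  # Slow down before reaching the limit
            return result
        except RateLimitError:
            time.sleep(60)  # Wait for a full minute, then retry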
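The loop above waits a fixed 60 seconds and retries forever. A common variation, sketched here as an alternative rather than as the article's method, is exponential backoff with a little jitter and a cap on attempts. RateLimitError, retry_with_backoff and every default value are assumptions for illustration; if the server tells you how long to wait, prefer that figure over the computed delay.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception your client raises on HTTP 429."""


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Run call(); on RateLimitError wait 1x, 2x, 4x... base_delay and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Give up instead of hammering the API
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing sleep as a parameter lets you test the retry schedule without actually waiting.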

Tips to Reduce Weight Consumption

1. Reduce Depth Levels

If you only need 5 levels of the order book, don't pull 1,000 levels. There is a 50x difference in weight.
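A sketch of picking the shallowest depth that still covers what a strategy needs. DEPTH_LIMITS is an assumed list of accepted depth values and smallest_depth_limit is a hypothetical helper; check the order book endpoint's documentation for the values it accepts.

```python
import bisect

# Assumed set of accepted depth values, in ascending order.
DEPTH_LIMITS = [5, 10, 20, 50, 100, 500, 1000, 5000]


def smallest_depth_limit(levels_needed: int) -> int:
    """Return the smallest accepted depth that covers levels_needed."""
    if levels_needed < 1:
        raise ValueError("levels_needed must be at least 1")
    index = bisect.bisect_left(DEPTH_LIMITS, levels_needed)
    if index == len(DEPTH_LIMITS):
        raise ValueError("no depth value covers that many levels")
    return DEPTH_LIMITS[index]
```

Asking for 7 levels would therefore request a depth of 10 rather than defaulting to a much deeper and costlier book.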

2. Switch to WebSockets

Prices, order books, and K-lines can be streamed in real-time via WebSockets (WS). WS streams do not consume weight.
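As a starting point, the sketch below builds the address for one combined stream carrying a ticker, a 5-level book and 1-minute K-lines. It assumes the spot stream base URL and the symbol@stream naming scheme; verify both against the current WebSocket documentation. Connecting and reading messages is left to whichever WebSocket client you use.

```python
from typing import Iterable

# Assumed spot market-stream base URL.
WS_BASE = "wss://stream.binance.com:9443"


def combined_stream_url(symbol: str, streams: Iterable[str]) -> str:
    """Build one URL that multiplexes several streams for a symbol."""
    # Stream names use the lowercase symbol, e.g. btcusdt@depth5.
    names = [f"{symbol.lower()}@{stream}" for stream in streams]
    if not names:
        raise ValueError("at least one stream is required")
    return f"{WS_BASE}/stream?streams=" + "/".join(names)


url = combined_stream_url("BTCUSDT", ["ticker", "depth5", "kline_1m"])
print(url)
```

One connection carrying several streams replaces three separate REST polling loops.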

3. Use Batch Endpoints

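Batch endpoints let you place several orders in one request. On USD-M Futures, for example, /fapi/v1/batchOrders accepts up to 5 orders per call; the spot API under /api/v3 has no batchOrders endpoint. Compare the batch endpoint's documented weight with that of 5 individual calls before relying on the saving.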

4. Implement Caching

Cache data that doesn't need to be updated every single second.
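A minimal time-to-live cache illustrates the idea: a value is fetched once and then reused until it is older than the chosen TTL. TTLCache is a hypothetical helper, and the clock is injectable so the behaviour can be tested without waiting.

```python
import time


class TTLCache:
    """Reuse a fetched value until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (fetched_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only when stale."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]
        value = fetch()  # This is the call that costs weight
        self._store[key] = (now, value)
        return value
```

Account information, at a weight of 10 per call, is a natural candidate: with a 30-second TTL it costs 20 weight per minute no matter how often the strategy reads it.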

WebSocket Advantages

Data Type       REST Weight      WebSocket
Market Price    1 per call       Streamed, 0 weight
Order Book      1–50 per call    Streamed, 0 weight
K-lines         1 per call       Streamed, 0 weight

Best Practice Strategy:

  • Use WebSockets for real-time data flow (0 weight consumption).
  • Use REST only for placing and cancelling orders.

Spreading Load Across Multiple Accounts

Using API Keys from different accounts can spread your consumption, but only for limits that are counted per account.

Strategy:

  • Use the Main account Key for order placement.
  • Use Sub-account Keys for market queries.
  • This enlarges your available pool for per-account limits; limits counted per IP are still shared (see Common Mistakes).

Boosting Weight with VIP Levels

Higher VIP levels offer:

  • Increased per-minute weight limits.
  • Higher limits for specific endpoints.
  • Extra headroom that, for API-heavy strategies, can matter more than the fee discount.

Even VIP 1 doubles your weight limit, which is highly beneficial for high-frequency strategies.

Monitoring Code Snippet

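class RateMonitor:
    """Tracks the most recent used-weight reading and warns near the limit."""

    def __init__(self, limit=1200, alert_threshold=1000):
        self.limit = limit                      # Per-minute weight limit for your level
        self.alert_threshold = alert_threshold  # Warn once usage passes this value
        self.last_weight = 0

    def update(self, headers):
        """Record usage from a response's headers; call after every request."""
        used = int(headers.get('X-MBX-USED-WEIGHT-1M', 0))
        if used > self.alert_threshold:
            print(f"WARN: {used} / {self.limit} used")
        self.last_weight = used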

Common Mistakes

1. Using Only REST in a Loop

Placing orders while checking balances and K-lines every second via REST will cause your weight usage to skyrocket. Switch to WS for streaming.

2. Failing to Handle Exceptions

Sending requests immediately after a 429 error instead of sleeping will escalate to a 418 IP ban.

3. Multiple Keys on a Single IP

While Keys are separate, IP-based rate limits still exist. Multiple Keys on one IP may be counted together for certain limits.

4. Ignoring Weight Headers

Running requests blindly until you hit the limit is a recipe for disaster. Monitor the headers.

Pre-Live Testing

  • Run high-frequency tests on the testnet to see if you trigger 429 errors, then adjust.
  • Set up monitoring alerts for your live environment to warn you when weight usage approaches the limit so you can decelerate automatically.
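Automatic deceleration can be as simple as a delay that grows as usage approaches the limit. The sketch below ramps the pause linearly between a soft threshold and the hard limit; throttle_delay and its default values are assumptions, to be tuned against what you observe on the testnet.

```python
def throttle_delay(used_weight, soft_limit=1000, hard_limit=1200, max_delay=10.0):
    """Seconds to pause before the next request, given the used weight.

    Returns 0 below soft_limit, then ramps linearly up to max_delay
    as usage approaches hard_limit.
    """
    if used_weight <= soft_limit:
        return 0.0
    over = min(used_weight, hard_limit) - soft_limit
    return max_delay * over / (hard_limit - soft_limit)
```

Feed it the latest X-MBX-USED-WEIGHT-1M reading after each response and sleep for the returned number of seconds before sending the next request.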

FAQ

Q: How long does a 429 ban last? A: Usually a few seconds to a few minutes. Do not send any requests during this time.

Q: How long does a 418 ban last? A: It grows with repeat offences. Binance's API documentation describes IP bans ranging from 2 minutes to 3 days.

Q: Can I appeal a ban? A: You can contact support, but it's rarely effective. You simply have to wait it out.

Q: Is the weight shared across all endpoints? A: Yes. All weight used across all endpoints adds up to your per-minute limit (e.g., 1,200/min).

Q: Is the Order limit per IP or per account? A: It is per account. Using multiple IPs for the same account will not bypass the order limit.

Conclusion

Rate limits are manageable as long as you plan for them. Monitor your weight, back off as soon as you see a 429, and you should never escalate to a 418 ban.