Handling Binance API Rate Limits: What Does 'Weight' Mean?
Rate limiting is one of the most common pitfalls for quantitative API strategies. This article explains the two types of limits, Weight and Order count, and how to implement graceful backoff.
Before you start, create an API Key on the Binance official website.
Two Dimensions of Rate Limits
The Binance API uses two independent dimensions to control traffic:
1. Weight
Every endpoint has an assigned "weight" value. The total weight of all requests per minute must not exceed a specific limit.
| Level | Weight per Minute |
|---|---|
| Standard | 1,200 |
| VIP 1 | 2,400 |
| VIP 2 | 3,600 |
| ... | Increases with VIP level |
2. Order Count
There are additional limits specifically for order placement:
| Time Window | Order Limit |
|---|---|
| 10 Seconds | 100 |
| 1 Day | 200,000 |
Exceeding either limit results in a temporary ban.
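The numbers in the tables above can change over time, so it may be safer to read them at startup than to hard-code them. The exchange information endpoint (GET /api/v3/exchangeInfo) reports limits in a rateLimits array. The sketch below indexes such an array by limit type and window length. The sample data is hand-written to mirror the tables in this article, not a live response, and the helper name index_rate_limits is hypothetical.

```python
# Hand-written sample mirroring the tables above; a live response may differ.
SAMPLE_RATE_LIMITS = [
    {"rateLimitType": "REQUEST_WEIGHT", "interval": "MINUTE", "intervalNum": 1, "limit": 1200},
    {"rateLimitType": "ORDERS", "interval": "SECOND", "intervalNum": 10, "limit": 100},
    {"rateLimitType": "ORDERS", "interval": "DAY", "intervalNum": 1, "limit": 200000},
]

INTERVAL_SECONDS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600, "DAY": 86400}


def index_rate_limits(rate_limits):
    """Map (limit type, window length in seconds) to the allowed amount."""
    table = {}
    for entry in rate_limits:
        window = INTERVAL_SECONDS[entry["interval"]] * entry["intervalNum"]
        table[(entry["rateLimitType"], window)] = entry["limit"]
    return table


limits = index_rate_limits(SAMPLE_RATE_LIMITS)
print(limits[("REQUEST_WEIGHT", 60)])  # weight allowed per minute
print(limits[("ORDERS", 10)])          # orders allowed per 10 seconds
```

Keying by window length in seconds keeps the two order limits (10 seconds and 1 day) distinct even though they share a limit type.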
Weight per Endpoint
Common examples:
| Endpoint | Weight |
|---|---|
| Market Snapshot | 1 |
| Order Book (5 levels) | 1 |
| Order Book (100 levels) | 5 |
| Order Book (1,000 levels) | 50 |
| K-line / Candles | 1 |
| Place Order | 1 |
| Cancel Order | 1 |
| Account Information | 10 |
Fetching the 1,000-level order book costs 50 weight per call, so at the Standard limit of 1,200 you can do it at most 24 times per minute (1,200 / 50), and only if you send nothing else.
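The same arithmetic can be applied to a whole request plan before you deploy it. This is a small sketch using the weights from the table above; the endpoint labels and the example plan are made up for illustration.

```python
# Weights taken from the table above; the keys are illustrative labels.
WEIGHTS = {"depth_5": 1, "depth_1000": 50, "klines": 1, "account": 10}


def max_calls_per_minute(weight_limit, endpoint_weight):
    """How many calls to one endpoint fit in a minute if nothing else is sent."""
    return weight_limit // endpoint_weight


def minute_cost(plan, weights):
    """Total weight of a plan given as {endpoint label: calls per minute}."""
    return sum(weights[name] * calls for name, calls in plan.items())


# Hypothetical plan: 10 deep books, 60 K-line polls, 6 account checks per minute.
plan = {"depth_1000": 10, "klines": 60, "account": 6}

print(max_calls_per_minute(1200, WEIGHTS["depth_1000"]))
print(minute_cost(plan, WEIGHTS))
```

Comparing minute_cost against your limit shows how much headroom is left for order placement and retries.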
Response Headers
Every API response includes headers to help you monitor usage:
- X-MBX-USED-WEIGHT-1M: Weight used in the current minute.
- X-MBX-ORDER-COUNT-10S: Orders placed in the current 10-second window.
Monitor these values to decelerate requests before hitting the wall.
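One way to read these headers is sketched below. It assumes the headers arrive as a plain dict of strings. Some HTTP clients change header casing, so the lookup is case-insensitive. The function names and the 80% safety margin are illustrative choices, not values from Binance.

```python
def read_usage(headers):
    """Extract usage counters from response headers, defaulting to 0 if absent."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return {
        "weight_1m": int(lowered.get("x-mbx-used-weight-1m", 0)),
        "orders_10s": int(lowered.get("x-mbx-order-count-10s", 0)),
    }


def should_slow_down(used, limit, safety=0.8):
    """True once usage reaches the chosen fraction of the limit."""
    return used >= limit * safety


usage = read_usage({"x-mbx-used-weight-1m": "1000"})
print(usage["weight_1m"], should_slow_down(usage["weight_1m"], 1200))
```

Slowing down at a fraction of the limit leaves room for requests that are already in flight when the counter is read.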
Consequences of Overuse
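- 429 Error: A temporary block for exceeding a limit. Stop all requests and wait; the Retry-After response header gives the number of seconds to back off.
- 418 Error: Triggered by repeatedly ignoring 429 errors. This results in an IP ban; Binance's documentation describes bans that scale from 2 minutes up to 3 days for repeat offenders. Never brute-force these errors.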
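Binance's documentation describes a Retry-After header on 429 and 418 responses that gives the number of seconds to wait. The sketch below turns a status code and headers into a wait time. The function name and the 60-second fallback are assumptions for illustration.

```python
def seconds_to_wait(status_code, headers, fallback=60):
    """Return how long to pause after a response; 0 if it was not rate limited."""
    if status_code not in (418, 429):
        return 0
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        # Never return less than 1 second for a rate-limited response.
        return max(int(lowered["retry-after"]), 1)
    except (KeyError, ValueError):
        # Header missing or not a whole number of seconds: use the fallback.
        return fallback


print(seconds_to_wait(429, {"Retry-After": "7"}))
print(seconds_to_wait(418, {}))
```

Treating 418 the same way as 429 here is deliberate: in both cases the only safe action is to stop sending requests for the stated period.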
Graceful Backoff
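import time

class RateLimitError(Exception):
    """Placeholder for the error your client raises on HTTP 429."""

def call_with_backoff(api_call):
    # api_call is assumed to return a (result, response_headers) pair
    while True:
        try:
            result, headers = api_call()
        except RateLimitError:
            time.sleep(60)  # Wait for a full minute, then retry
            continue
        weight_used = int(headers.get('X-MBX-USED-WEIGHT-1M', 0))
        if weight_used > 1000:
            time.sleep(10)  # Slow down before the next request
        return result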
Tips to Reduce Weight Consumption
1. Reduce Depth Levels
If you only need 5 levels of the order book, don't pull 1,000 levels. There is a 50x difference in weight.
2. Switch to WebSockets
Prices, order books, and K-lines can be streamed in real time via WebSockets (WS). Stream data does not consume REST request weight, although WebSocket connections have their own connection and message limits.
3. Use Batch Endpoints
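Where a batch endpoint is available, such as batchOrders on the futures API (POST /fapi/v1/batchOrders), you can place up to 5 orders in one request. Check the endpoint's documented weight to confirm it costs less than 5 individual calls.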
4. Implement Caching
Cache data that doesn't need to be updated every single second.
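Caching, the last tip above, can be as simple as a time-to-live wrapper around a fetch function. This is a minimal sketch: the class name TTLCache and the injectable clock are illustrative choices, and fetch_exchange_info stands in for whatever REST call you want to avoid repeating.

```python
import time


class TTLCache:
    """Reuse a fetched value until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the cache can be tested without sleeping
        self._store = {}        # key -> (fetch time, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]       # still fresh: no request, no weight spent
        value = fetch()
        self._store[key] = (now, value)
        return value


def fetch_exchange_info():
    # Placeholder for a real REST call.
    return {"symbols": ["BTCUSDT"]}


cache = TTLCache(ttl_seconds=300)
info = cache.get_or_fetch("exchange_info", fetch_exchange_info)
print(info["symbols"])
```

Slow-changing data such as symbol rules is a good candidate for a long TTL, while balances may only tolerate a few seconds.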
WebSocket Advantages
| Data Type | REST Weight | WebSocket |
|---|---|---|
| Market Price | 1 per call | Streamed, 0 weight |
| Order Book | 1–50 per call | Streamed, 0 weight |
| K-lines | 1 per call | Streamed, 0 weight |
Best Practice Strategy:
- Use WebSockets for real-time data flow (0 weight consumption).
- Use REST only for placing and cancelling orders.
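As a starting point for the WebSocket side, the sketch below only builds a combined-stream URL, which you would then pass to the WebSocket client of your choice. The base URL and the stream name patterns (ticker, partial depth, kline) follow Binance's public WebSocket documentation as understood at the time of writing; treat them as assumptions and verify them before use. The function name is hypothetical.

```python
BASE_URL = "wss://stream.binance.com:9443"  # assumed spot stream host; verify in the docs
PARTIAL_DEPTH_LEVELS = (5, 10, 20)          # levels assumed valid for partial depth streams


def combined_stream_url(symbol, depth_levels=5, kline_interval="1m"):
    """Build one URL that subscribes to ticker, partial depth, and K-line streams."""
    if depth_levels not in PARTIAL_DEPTH_LEVELS:
        raise ValueError(f"depth_levels must be one of {PARTIAL_DEPTH_LEVELS}")
    name = symbol.lower()  # stream names use lowercase symbols
    streams = [
        f"{name}@ticker",
        f"{name}@depth{depth_levels}",
        f"{name}@kline_{kline_interval}",
    ]
    return f"{BASE_URL}/stream?streams=" + "/".join(streams)


print(combined_stream_url("BTCUSDT"))
```

Subscribing to several streams over one connection also keeps you well inside the separate WebSocket connection limits.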
Spreading Load Across Multiple Accounts
Using API Keys from different accounts can spread your consumption across separate limits.
Strategy:
- Use the Main account Key for order placement.
- Use Sub-account Keys for market queries.
- This enlarges your available pool only for limits counted per account, or when each Key runs from its own IP; limits counted per IP are still shared.
Boosting Weight with VIP Levels
Higher VIP levels offer:
- Increased per-minute weight limits.
- Higher limits for specific endpoints.
- Benefits that can be even more valuable than fee discounts.
Even VIP 1 doubles your weight limit, which is highly beneficial for high-frequency strategies.
Monitoring Code Snippet
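class RateMonitor:
    """Tracks the last reported weight and warns near the limit."""

    def __init__(self, limit=1200, alert_threshold=1000):
        self.limit = limit
        self.alert_threshold = alert_threshold
        self.last_weight = 0

    def update(self, headers):
        # Header values arrive as strings; default to 0 if the header is absent
        used = int(headers.get('X-MBX-USED-WEIGHT-1M', 0))
        if used > self.alert_threshold:
            print(f"WARN: {used} / {self.limit} used")
        self.last_weight = used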
Common Mistakes
1. Using Only REST in a Loop
Placing orders while checking balances and K-lines every second via REST will cause your weight usage to skyrocket. Switch to WS for streaming.
2. Failing to Handle Exceptions
Retrying immediately after a 429 error instead of sleeping will escalate to a 418 IP ban.
3. Multiple Keys on a Single IP
While Keys are separate, IP-based rate limits still exist. Multiple Keys on one IP may be counted together for certain limits.
4. Ignoring Weight Headers
Running requests blindly until you hit the limit is a recipe for disaster. Monitor the headers.
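Besides reading the headers, you can keep a local budget so that a request is delayed before it is sent rather than after the server complains. The sketch below uses a sliding 60-second window, which is an assumption and is likely more conservative than the server's own counter. The class name and the default limit of 1,000 (a margin under 1,200) are illustrative.

```python
import time
from collections import deque


class WeightBudget:
    """Client-side estimate of weight spent in the last `window` seconds."""

    def __init__(self, limit=1000, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._spent = deque()  # (timestamp, weight), oldest first

    def _used(self, now):
        # Drop entries that have aged out of the window, then total the rest.
        while self._spent and now - self._spent[0][0] >= self.window:
            self._spent.popleft()
        return sum(weight for _, weight in self._spent)

    def delay_for(self, weight):
        """Seconds to wait until a request of this weight fits; 0 if it fits now."""
        now = self.clock()
        used = self._used(now)
        if used + weight <= self.limit:
            return 0.0
        freed = 0
        for timestamp, spent in self._spent:
            freed += spent
            if used - freed + weight <= self.limit:
                return timestamp + self.window - now
        return self.window  # request is heavier than the whole budget

    def record(self, weight):
        """Call after sending a request of the given weight."""
        self._spent.append((self.clock(), weight))


budget = WeightBudget()
budget.record(50)  # e.g. one deep order book request
print(budget.delay_for(10))
```

A typical loop would call time.sleep(budget.delay_for(w)) before each request and budget.record(w) after it, while still trusting the response headers as the source of truth.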
Pre-Live Testing
- Run high-frequency tests on the testnet to see if you trigger 429 errors, then adjust.
- Set up monitoring alerts for your live environment to warn you when weight usage approaches the limit so you can decelerate automatically.
FAQ
Q: How long does a 429 ban last? A: Usually a few seconds to a few minutes. Do not send any requests during this time.
Q: How long does a 418 ban last? A: It depends on how often you have offended. Binance's documentation describes IP bans that scale from 2 minutes up to 3 days for repeat offenders.
Q: Can I appeal a ban? A: You can contact support, but it's rarely effective. You simply have to wait it out.
Q: Is the weight shared across all endpoints? A: Yes. All weight used across all endpoints adds up to your per-minute limit (e.g., 1,200/min).
Q: Is the Order limit per IP or per account? A: It is per account. Using multiple IPs for the same account will not bypass the order limit.
Conclusion
Rate limits aren't scary as long as you account for them. Monitor your weight, back off as soon as you see a 429, and you should never face an IP ban.