API throttling limits

Throttling limits the number of requests you (or your authorized developer) can submit to a given operation in a given amount of time. It protects the web service from being overwhelmed with requests and ensures that all authorized developers have access to it. Throttled requests return an HTTP 429 status code. See error handling for information on how to handle throttled requests.

The Leaky Bucket Algorithm
Key terms
Tips on avoiding throttling

The Leaky Bucket Algorithm

This algorithm is based on the analogy of a bucket with a hole in the bottom, from which water leaks at a constant rate. Water can be added to the bucket intermittently, but if too much water is added at once, or if water is added at too high an average rate, the water will exceed the capacity of the bucket.

To apply this analogy here, imagine that the bucket represents the maximum request quota, which is the maximum number of requests you can make at one time. The hole in the bucket represents the restore rate, which determines how quickly you regain the ability to make new requests. If you submit too many requests at once, the bucket overflows and throttling occurs. Once the bucket is full, it takes some time before you can add more water, because the water leaks from the bucket at a steady rate. In the same way, after you have reached the maximum request quota, your ability to submit more requests is governed by the restore rate.
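The analogy above can be modeled on the client side to predict when a request is likely to be throttled. The following is a minimal sketch, not the service's actual implementation: the class name `RequestQuota`, its parameters, and the injectable `clock` are all illustrative assumptions.

```python
import time


class RequestQuota:
    """Hypothetical client-side model of a request quota that refills at a fixed rate."""

    def __init__(self, max_quota, restore_seconds, clock=time.monotonic):
        self.max_quota = max_quota              # bucket capacity (burst)
        self.restore_seconds = restore_seconds  # seconds needed to restore one request
        self.clock = clock                      # injectable so the model can be tested
        self.available = float(max_quota)       # assume the quota starts full
        self.last_update = clock()

    def _restore(self):
        # Add back the requests restored since the last update, capped at max_quota.
        now = self.clock()
        restored = (now - self.last_update) / self.restore_seconds
        self.available = min(self.max_quota, self.available + restored)
        self.last_update = now

    def try_acquire(self):
        """Return True if a request can be sent now, False if it would likely be throttled."""
        self._restore()
        if self.available >= 1:
            self.available -= 1
            return True
        return False
```

Calling `try_acquire()` before each request lets a client hold back requests locally instead of sending them and receiving a 429 response. The service remains the source of truth, so a throttled response can still occur.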

Key terms

Request quota - The number of requests that you can submit at one time without throttling. The request quota decreases with each request you submit, and increases at the restore rate (measured in seconds). The request quota is calculated for each Amazon seller account and Amazon MWS developer account pair.
Max quota (also called the burst rate) - The maximum size that the request quota can reach.
Restore rate (also called the recovery rate) - The rate, measured in seconds, at which your request quota increases over time, up to the maximum request quota.

To apply these ideas, consider this example. Imagine that you want to use the CreateCharge operation to submit 30 requests. The CreateCharge operation has a maximum request quota of 10 and a restore rate of one request every 4 seconds. If you submit all 30 requests at once, only the first 10 are accepted and the remaining 20 are throttled. To handle the remaining requests efficiently, wait for the quota to restore and resubmit in batches. Because one request is restored every 4 seconds, 10 requests are available again after 40 seconds, and you can submit the next 10. After another 40 seconds, you can submit the final 10. So, instead of submitting all requests at once and dealing with throttling, you can automate your process to submit requests incrementally.

For example, you could submit 10 of your original 30 requests, which depletes the request quota (0 requests remaining). If you then wait one minute, the request quota is restored to its maximum of 10 (60 seconds ÷ 4 seconds = 15 potential restorations, but the quota is capped at the maximum of 10), and you can submit 10 more requests. For the remaining 10 CreateCharge requests, wait another 40 seconds for the quota to return to 10, then submit the final batch.
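The batch arithmetic in this example can be sketched as a small helper. This is an illustration only: the function name `batch_schedule` is hypothetical, and it assumes the quota starts full, that each batch is sent instantly, and that nothing else consumes the same quota.

```python
def batch_schedule(total_requests, max_quota, restore_seconds):
    """Return (wait_before_batch_seconds, batch_size) pairs for sending all requests."""
    schedule = []
    remaining = total_requests
    available = max_quota  # assumption: the quota starts full
    while remaining > 0:
        batch = min(remaining, max_quota)
        # Wait only for the requests that still need to be restored.
        wait = max(0, batch - available) * restore_seconds
        schedule.append((wait, batch))
        remaining -= batch
        available = 0  # the batch uses up everything that was restored
    return schedule


# CreateCharge example from the text: 30 requests, quota 10, one request per 4 seconds.
print(batch_schedule(30, 10, 4))  # [(0, 10), (40, 10), (40, 10)]
```

Waiting longer than the computed time, such as the one-minute wait mentioned above, is also safe, because the quota never grows beyond its maximum.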

In the tables below, "Burst" is the maximum number of requests that can be made before getting throttled, and the quota is restored at a rate of 1 request per "Restore" seconds.

API	Live		Sandbox
	Burst	Restore (s)	Burst	Restore (s)
Cancel Charge	10	2	2	1
Capture Charge	20	4	2	1
Close Charge Permission	10	2	2	1
Complete Checkout Session	20	4	2	1
Create Charge	10	4	2	1
Create Checkout Session	40	16	5	1
Create Merchant Account	10	1	1	1
Create Refund	10	4	2	1
Create Delivery Tracker	10	1	1	1
Get Authorization Token	5	1	N/A	N/A
Get Charge	20	4	5	1
Get Charge Permission	20	4	10	1
Get Checkout Session	40	8	10	1
Get Merchant Status	10	1	1	1
Get Refund	20	4	5	1
Update Charge Permission	10	2	5	1
Update Checkout Session	20	8	5	1

API	Live		Sandbox
	Burst	Restore (s)	Burst	Restore (s)
Cancel Charge	10	2	2	1
Capture Charge	20	4	2	1
Close Charge Permission	10	2	2	1
Complete Checkout Session	20	4	2	1
Create Charge	10	4	2	1
Create Checkout Session	40	16	2	1
Create Refund	10	4	2	1
Create Delivery Tracker	10	1	1	1
Get Authorization Token	5	1	N/A	N/A
Get Charge	20	4	5	1
Get Charge Permission	20	4	5	1
Get Checkout Session	40	8	5	1
Get Refund	20	4	5	1
Update Charge Permission	10	2	2	1
Update Checkout Session	20	8	2	1

API	Live		Sandbox
	Burst	Restore (s)	Burst	Restore (s)
Cancel Charge	10	2	2	1
Capture Charge	20	4	2	1
Close Charge Permission	10	2	2	1
Complete Checkout Session	20	4	2	1
Create Charge	10	4	2	1
Create Checkout Session	40	16	2	1
Create Merchant Account	10	1	1	1
Create Refund	10	4	2	1
Create Delivery Tracker	10	1	1	1
Get Authorization Token	5	1	N/A	N/A
Get Charge	20	4	5	1
Get Charge Permission	20	4	5	1
Get Checkout Session	40	8	5	1
Get Merchant Status	10	1	1	1
Get Refund	20	4	5	1
Update Charge Permission	10	2	2	1
Update Checkout Session	20	8	2	1

Tips on avoiding throttling

There are several things you can do to make sure your requests and submissions are processed successfully:

Know the throttling limit of the specific request you are submitting.
Have a "back off" plan for automatically reducing the number of requests if the service is unavailable. The plan should use the restore rate value to determine when a request should be resubmitted.
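A "back off" plan along these lines might look like the following sketch. It is not an official client: `send` is a hypothetical callable that submits one request and returns its HTTP status code, and the choice to lengthen the wait by one restore interval per throttled attempt is an assumption.

```python
import time


def submit_with_backoff(send, restore_seconds, max_attempts=5, sleep=time.sleep):
    """Call send() until it is not throttled, waiting between attempts.

    send: hypothetical callable that submits the request and returns the HTTP status code.
    restore_seconds: the restore rate of the operation, taken from the limits table.
    """
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts:
            # Wait for `attempt` restore intervals, so repeated throttling backs off further.
            sleep(restore_seconds * attempt)
    return 429  # still throttled after max_attempts


# Usage with a stand-in for a real request: throttled twice, then accepted.
responses = iter([429, 429, 200])
print(submit_with_backoff(lambda: next(responses), restore_seconds=4, sleep=lambda s: None))  # 200
```

Passing `sleep` as a parameter keeps the helper testable. In production the default `time.sleep` would be used.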

You should also distribute your requests to maximize service availability:

Submit requests at times other than on the hour or on the half hour. For example, submit requests at 11 minutes after the hour or at 41 minutes after the hour.
Take advantage of times during the day when traffic is likely to be low, such as early evening or early morning hours.
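One way to schedule work away from the hour and half hour is sketched below. The function name `next_off_peak_slot` is hypothetical, and the default minutes (11 and 41) are simply the ones used in the example above.

```python
from datetime import datetime, timedelta


def next_off_peak_slot(now, minutes=(11, 41)):
    """Return the next datetime after `now` whose minute is one of `minutes`."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    candidates = []
    for hour_offset in (0, 1):  # check the current hour and the next one
        base = top_of_hour + timedelta(hours=hour_offset)
        for minute in minutes:
            candidates.append(base + timedelta(minutes=minute))
    return min(c for c in candidates if c > now)


print(next_off_peak_slot(datetime(2024, 1, 1, 10, 45)))  # 2024-01-01 11:11:00
```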