AWS DynamoDB. Access capacity and service throttling.

Yuri Fenyuk
6 min read · Jul 23, 2018

Most Amazon cloud developers use DynamoDB tables, as it is the simplest way to persist data, and the whole IT world is about data. The learning path is very short. Nothing needs to be specified to create a new table: just type a name and a partition key and you are good to go. Reading and writing data to your table is just as easy. Within a quarter of an hour, with the help of the AWS SDK, you have a dozen lines of code that read and write your data. Fantastic! Time to shift attention to something else.

A DynamoDB table works as expected until you upload more documents. More data inevitably leads to a sudden exception in your code and your first encounter with ProvisionedThroughputExceededException. In most cases there is no time to find a perfect solution when you can just increase the provisioned capacity via the Capacity tab in the AWS Console. Back on track with your plans :)

So far, so good! Let’s have a couple more tables, let’s clone all tables into staging and production versions… and next week you see the billing report from Amazon. Each table costs almost nothing; it is the sheer number of tables that makes things worse.

Time to pay more attention to the Capacity tab. As it turns out, read capacity is half the price of write capacity, and since we are mostly reading, let’s cut the write capacity in half. A step in the right direction.

The same tab has an Auto Scaling section. That could help as well, as nobody uses your application 24/7 (in fact, it’s not released yet :) ). Amazon promises fully automated scaling of capacity up and down, and the functionality is just four clicks away. Let’s test it immediately! Yes, Auto Scaling works as expected, with only one exception: ProvisionedThroughputExceededException is still here, because scaling reacts quite slowly, as it is based on AWS CloudWatch alarms.

Time to play with AWS DynamoDB to find a better solution.

#1. Straightforward approach

Below is the js code for multiple simultaneous DynamoDB reads. It is not too important what exactly is read (the sample table has just one row with a few columns). The idea is to provoke AWS service throttling.

You can grab and run this code yourself (do not worry, I will do it for you below) as soon as you have an AWS subscription and any DynamoDB table.

The central part is the ReadDynamoPromise function, which does a simple DynamoDB read. The table ‘table-sanbox’ is located in the eu-west-1 region and has a read capacity of 1. In the Main function I ask how many times the read function needs to be run, asynchronously start as many copies as needed, wait for them to finish and print simple statistics.
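For reference, here is a minimal sketch of the shape of that file (the real code lives in the repo linked at the end); the partition key name and value are placeholders, and the statistics bookkeeping is simplified:

```js
// Sketch of 01.default-DocumentClient.js; 'id'/'1' are placeholder key values
const AWS = require('aws-sdk');

const docClient = new AWS.DynamoDB.DocumentClient({ region: 'eu-west-1' });

// a single plain DynamoDB read from the sample table
function ReadDynamoPromise() {
  return docClient
    .get({ TableName: 'table-sanbox', Key: { id: '1' } })
    .promise();
}

async function Main(total) {
  const started = Date.now();
  // fire all reads at once and wait for every one to settle
  const outcomes = await Promise.all(
    Array.from({ length: total }, () =>
      ReadDynamoPromise().then(() => 'ok').catch(() => 'failed'))
  );
  const ok = outcomes.filter(o => o === 'ok').length;
  console.log(`returned: ${ok}, failed: ${total - ok}, took: ${(Date.now() - started) / 1000}s`);
}

Main(2000);
```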

Execution result for 01.default-DocumentClient.js

I asked for 2000 DynamoDB gets, which took 87 seconds to finish. 1022 of them returned successfully, but 978 failed (which means the throttling exception occurred many times on the Amazon side). The sum of the two is 2000. The retries are a particularly interesting part: the AWS SDK ships with nice retry logic, and its default setup helped to increase the ratio of successful results.

#2. With disabled retries

The next step is to understand how helpful the default retry logic is. The code below differs in line #13, where the AWS.DynamoDB instance is created with maxRetries equal to zero. Thus there should be no additional retries on the client side.
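Sketched, the relevant change looks like this (maxRetries is a standard AWS SDK v2 option; the rest of the file stays the same as in #1):

```js
// 02.zero-retries.js (sketch of the changed part only):
// create the low-level client with client-side retries disabled
const dynamo = new AWS.DynamoDB({ region: 'eu-west-1', maxRetries: 0 });
const docClient = new AWS.DynamoDB.DocumentClient({ service: dynamo });
```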

Output is below:

Execution result for 02.zero-retries.js

Now it took half a minute less (sure, Amazon’s retry logic is built on exponentially growing timeouts, and now it is off) and returned 903 results, which is about 10% less than with the default retry logic. Apparently, the ‘retries’ and ‘failures’ numbers being identical is the result of the client’s retry event handler being invoked and immediately failing.

#3. Realistic retries

The next modification (the full file can be found here) is to set the retry count to 5 and the base timeout to 300 ms.
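Sketched, the changed configuration could look like this (maxRetries and retryDelayOptions are standard AWS SDK v2 settings; everything else is unchanged):

```js
// 03.realistic-retries.js (sketch of the changed part only)
const dynamo = new AWS.DynamoDB({
  region: 'eu-west-1',
  maxRetries: 5,                     // up to 5 client-side retries per request
  retryDelayOptions: { base: 300 }   // 300 ms base for the exponential backoff
});
const docClient = new AWS.DynamoDB.DocumentClient({ service: dynamo });
```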

Execution result for 03.realistic-retries.js

The biggest improvement is in time, but the quality (the ratio of successful reads) is also better. The number of registered retries is much higher as well.

#4. Aggressive retries

Failures of any kind are not good, so let’s tackle this issue by setting the maximum retries to a very high number. In addition, I am calculating the time it takes to receive every 100 reads without an error.
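A sketch of that setup; the concrete retry count and the per-hundred bookkeeping below are my assumptions about how the measurement could be done:

```js
// 04.aggressive-retries.js (sketch): retry "forever" and time every 100 successes
const dynamo = new AWS.DynamoDB({
  region: 'eu-west-1',
  maxRetries: 50,                    // assumption: a very high retry count
  retryDelayOptions: { base: 300 }
});
const docClient = new AWS.DynamoDB.DocumentClient({ service: dynamo });

let succeeded = 0;
let hundredStartedAt = Date.now();

// call after every successful get to log how long each block of 100 reads took
function trackHundred() {
  succeeded += 1;
  if (succeeded % 100 === 0) {
    console.log(`reads ${succeeded - 99}-${succeeded}: ${(Date.now() - hundredStartedAt) / 1000}s`);
    hundredStartedAt = Date.now();
  }
}
```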

Execution result for 04.aggressive-retries.js

In this case, the price of stability (zero gets failed) is time. Twenty-one minutes in total!!! Also pay attention to the time spent returning each hundred. The first thousand is really quick because AWS allows a short capacity boost, expecting that the spike will end very quickly; in this case it allowed 8 read capacity units for a short time at the start. The second block of results is roughly 30 seconds per hundred, which is also not bad. The last three blocks are the result of the exponentially growing retry timeout. Even the last unfinished get, which came back to the client 50 times, was enough to keep the last hundred open for a record 862 seconds.
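Conceptually, the delay before retry number n grows like this (an illustration of the idea only; the SDK’s exact formula and jitter are internals not covered here):

```js
// illustrative backoff shape: 300 ms, 600 ms, 1200 ms, ... for base = 300
function backoffDelayMs(retryCount, baseMs) {
  return Math.pow(2, retryCount) * baseMs;
}
```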

The good news is that it is possible to tune the AWS retry logic further and squeeze out more than in the run above.

#5. Retries in code

This sample turns off the AWS retry logic but catches ProvisionedThroughputExceededException on the client and retries the get request until it succeeds.

On line #16 the AWS retry logic is turned off. On line #90 I wrap the ReadDynamoPromise function in InvokeProvisioned, which is implemented in a way that catches the rejected promise of the wrapped function (ReadDynamoPromise in this case) and, if it is a throttling error, starts the wrapped function again.
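Condensed, the wrapper looks roughly like this (the full file in the repo keeps more statistics; the fixed delay between attempts here is an assumption):

```js
// sketch of InvokeProvisioned: keep restarting fn() while DynamoDB throttles it
// (the SDK's own retries are off via maxRetries: 0, as in #2)
async function InvokeProvisioned(fn, delayMs = 300) {
  for (;;) {
    try {
      return await fn();
    } catch (err) {
      if (err.code !== 'ProvisionedThroughputExceededException') {
        throw err;                                                  // a real error: give up
      }
      await new Promise(resolve => setTimeout(resolve, delayMs));   // throttled: wait and retry
    }
  }
}

// usage: InvokeProvisioned(ReadDynamoPromise)
```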

The full version is quite a lot of code, especially compared with the AWS retry logic, which is implemented for you, but there can be situations where the developer wants to decide whether to keep retrying a failed function (based on the importance of the failed data, the number of failures, etc.). The ability to tune the retry logic further, at least in theory, is a valuable tool in an experienced developer’s arsenal.

Execution result for 05.retries-in-code.js

I am comparing the results of 05.retries-in-code.js and 04.aggressive-retries.js. Eventually, and this is the most important thing, 100% accuracy was reached in both runs. The last run ended in roughly 06:30, which is significantly less than 21 minutes. Keep in mind that in both cases the logic can be tweaked further and the results can differ.

Now comparing the hundreds. In the first third, 04.aggressive-retries.js was insignificantly quicker (100 ms versus 300 ms); the second third was almost equal (roughly 33 seconds). The last third brought the major difference in time.

This article gives a very brief description of tweaks that can help deal with DynamoDB service throttling, with most of the attention on how to do it in client-side code. Although it was introduced quite a long time ago, this AWS service has still been evolving rapidly over the last couple of years (DynamoDB DAX, global tables and built-in backups are just the most sparkling examples of how Amazon engineers release top features and make our life easier). Most likely, in a short time a perfect remedy will be released and make this article useless.

We will see.

Thanks for reading it!

P.S.: the link to my git repo is here: https://github.com/YurgenUA/articles/tree/master/throttling
