DynamoDB CheatSheet

DynamoDB is the most important service to know for the AWS Certified Developer Associate

  • DynamoDB is a fully managed NoSQL key/value and document database.
  • DynamoDB suits workloads with any amount of data that need predictable read and write performance and automatic scaling, from small to large and everywhere in between.
  • In provisioned capacity mode, DynamoDB scales up and down to support whatever read and write capacity per second you specify. In On-Demand mode, little to no capacity planning is needed.
  • DynamoDB supports Eventually Consistent Reads (default) and Strongly Consistent Reads, chosen on a per-call basis.
  • Eventually Consistent Reads return data immediately, but the data may be stale. Copies of data are generally consistent within 1 second.
  • Strongly Consistent Reads always read from the leader replica of the partition, since it always has an up-to-date copy. Data will never be inconsistent, but latency may be higher. Copies of data will be consistent with a guarantee of 1 second.
  • DynamoDB stores 3 copies of data on SSD drives across 3 AZs in a region.
  • DynamoDB's most common data types are B (Binary), N (Number), and S (String)
  • Tables consist of Items (rows) and items consist of Attributes (columns)
  • A Partition is a smaller chunk of your table's data. DynamoDB slices your table into partitions, which speeds up reads for very large tables
  • DynamoDB automatically creates Partitions for:
    • Every 10 GB of Data or
    • When you exceed the RCU (3000) or WCU (1000) limit for a single partition
    • When DynamoDB sees a pattern of a hot partition, it will split that partition in an attempt to fix the issue
  • DynamoDB will try to evenly split the RCUs and WCUs across Partitions
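The partition rules above can be turned into a rough back-of-the-envelope estimate. The sketch below is only a heuristic built from the limits quoted in these notes (10 GB, 3000 RCUs, 1000 WCUs per partition); it is not DynamoDB's actual partitioning algorithm, and the function names `estimate_partitions` and `per_partition_share` are hypothetical.

```python
import math

# Per-partition limits as quoted in the notes above.
MAX_PARTITION_SIZE_GB = 10
MAX_PARTITION_RCU = 3000
MAX_PARTITION_WCU = 1000


def estimate_partitions(size_gb, rcu, wcu):
    """Rough partition estimate (assumption: a heuristic, not AWS's algorithm)."""
    by_size = math.ceil(size_gb / MAX_PARTITION_SIZE_GB)
    by_throughput = math.ceil(rcu / MAX_PARTITION_RCU + wcu / MAX_PARTITION_WCU)
    # A table always has at least one partition.
    return max(by_size, by_throughput, 1)


def per_partition_share(rcu, wcu, partitions):
    """Even split of table capacity across partitions, as (RCUs, WCUs)."""
    return rcu / partitions, wcu / partitions


partitions = estimate_partitions(size_gb=45, rcu=3000, wcu=1000)
print(partitions, per_partition_share(3000, 1000, partitions))
```

Under these assumptions, a 45 GB table is driven by size rather than throughput, which shows why a large table can leave each partition with only a small slice of the provisioned capacity.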
  • Primary keys define where and how your data will be stored in partitions
  • Primary keys come in two types:
    • Simple Primary Key (using only a Partition Key)
    • Composite Primary Key (using both a Partition Key and a Sort Key)
  • Partition Key is also known as HASH
  • Sort Key is also known as RANGE
  • When creating a Simple Primary Key the Partition Key value must be unique
  • When creating a Composite Primary Key the combined Partition and Sort Key must be unique
  • When using a Sort Key, records on the partition are logically grouped together in ascending order
  • DynamoDB Global tables provide a fully managed solution for deploying multi-region, multi-master databases.
  • DynamoDB supports transactions via the TransactWriteItems and TransactGetItems API calls
  • Transactions let you read or write items across multiple tables at once with an all-or-nothing approach (all API calls must succeed)
  • DynamoDB Streams lets you set up a Lambda function that is triggered every time data is modified in a table, so you can react to changes
  • Streams do not consume RCUs
  • Scan
    • Design your table(s) so that your workload's primary access patterns do not use Scans.
    • Overall, scans should be needed sparingly, e.g. for an infrequent report.
    • A Scan reads through all items in a table and then returns one or more items after applying filters
    • By default returns all attributes for every item (use ProjectionExpression to limit)
    • Scans are sequential; you can speed up a scan with parallel scans using the Segment and TotalSegments parameters
    • Scans can be slow, especially with very large tables, and can easily consume your provisioned throughput.
    • Scans are one of the most expensive ways to access data in DynamoDB.
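As a minimal sketch of how a parallel scan is split up, the helper below builds one Scan request per segment using the Scan API's `Segment` and `TotalSegments` parameters. It only builds dictionaries and sends nothing to AWS. The helper name `parallel_scan_requests`, the table name `Orders`, and the attribute names are hypothetical.

```python
def parallel_scan_requests(table_name, total_segments, projection=None):
    """Build one Scan request dict per segment (nothing is sent to AWS)."""
    requests = []
    for segment in range(total_segments):
        params = {
            "TableName": table_name,
            "Segment": segment,               # zero-based slice for this worker
            "TotalSegments": total_segments,  # same value in every request
        }
        if projection:
            # Limit the attributes returned instead of fetching whole items.
            params["ProjectionExpression"] = projection
        requests.append(params)
    return requests


for request in parallel_scan_requests("Orders", 4, projection="OrderId, Total"):
    print(request)
```

Each dictionary could then be handed to a separate worker, for example as `client.scan(**request)` with a boto3 DynamoDB client, assuming credentials and the table exist. A parallel scan finishes faster but consumes throughput faster too.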
  • Query
    • Find items based on primary key values
    • Table must have a composite key in order to be able to query
    • By default queries are Eventually Consistent (set ConsistentRead to true for Strongly Consistent)
    • By default returns all attributes for each item found by a query (use ProjectionExpression to limit)
    • By default results are sorted ascending (set ScanIndexForward to false to reverse the order to descending)
  • DynamoDB has two capacity modes: Provisioned and On-Demand.
  • You can switch between these modes once every 24 hours.
  • Provisioned Throughput Capacity is the maximum amount of capacity your application is allowed to read or write per second from a table or index
    • Provisioned is suited for predictable or steady state workloads
    • RCU stands for Read Capacity Unit
    • WCU stands for Write Capacity Unit
    • You should enable Auto Scaling with Provisioned capacity mode. You set a floor and ceiling for the capacity you wish the table to support. DynamoDB automatically adds and removes capacity between these values on your behalf and throttles calls that go above the ceiling for too long.
    • If you go beyond your provisioned capacity, you’ll get an exception: ProvisionedThroughputExceededException (throttling)
    • Throttling is when requests are blocked because the read or write frequency is higher than the set thresholds, e.g. exceeding provisioned capacity, partitions splitting, or a table/index capacity mismatch.
  • On-Demand Capacity is pay per request. So you pay only for what you use.
  • On-Demand is suited for new or unpredictable workloads
    • The throughput is only limited by the default upper limits for a table (40K RCUs and 40K WCUs)
    • Throttling can occur if you exceed double your previous peak capacity (high water mark) within 30 minutes.
    • E.g. if you previously peaked at a maximum of 30,000 ops/sec, you could not peak immediately to 90,000 ops/sec, but you could to 60,000 ops/sec
    • Since there is no hard limit, On-Demand could become very expensive based on emerging scenarios
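The "double your previous peak" rule can be checked with simple arithmetic. This is a sketch of the rule as stated in these notes only; the function names are hypothetical and the real service behaviour involves more factors than this one comparison.

```python
def max_immediate_peak(previous_peak):
    """Highest rate reachable right away: double the previous peak."""
    return previous_peak * 2


def may_throttle(previous_peak, requested_rate):
    """True if the requested rate exceeds double the previous peak."""
    return requested_rate > max_immediate_peak(previous_peak)


print(may_throttle(30_000, 90_000))  # the 90,000 ops/sec case from the notes
print(may_throttle(30_000, 60_000))  # the 60,000 ops/sec case from the notes
```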
  • Calculating Reads (RCU)
    • A read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size.
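    • How to calculate RCUs for strongly consistent reads
      • Round the item size up to the nearest multiple of 4 KB
      • Divide the rounded size by 4
      • Multiply by the number of reads per second
    • How to calculate RCUs for eventually consistent reads
      • Round the item size up to the nearest multiple of 4 KB
      • Divide the rounded size by 4
      • Multiply by the number of reads per second
      • Divide the final number by 2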

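50 strongly consistent reads at 40 KB per item: (40/4) x 50 = 500 RCUs

10 strongly consistent reads at 6 KB per item: round 6 up to 8, then (8/4) x 10 = 20 RCUs

33 strongly consistent reads at 17 KB per item: round 17 up to 20, then (20/4) x 33 = 165 RCUs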
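The steps and worked examples above can be sketched as a small function. The name `rcus` is hypothetical, and rounding the eventually consistent result up to a whole unit is an assumption, since the notes only say to divide by 2.

```python
import math


def rcus(item_size_kb, reads_per_second, strongly_consistent=True):
    """Read capacity units needed, following the steps in the notes."""
    # Rounding up to the nearest 4 KB and dividing by 4 is one ceil division.
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        # Assumption: round a fractional result up to a whole unit.
        total = math.ceil(total / 2)
    return total


print(rcus(40, 50))  # 500
print(rcus(6, 10))   # 20
print(rcus(17, 33))  # 165
```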

  • Calculating Writes (WCU)

    • A write capacity unit represents: one write per second, for an item up to 1 KB
    • How to calculate WCUs
      • Round the item size up to the nearest 1 KB
      • Multiply by the number of writes per second
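The write calculation can be sketched the same way. The function name `wcus` and the sample figures are illustrative only.

```python
import math


def wcus(item_size_kb, writes_per_second):
    """Write capacity units needed: round the item size up to 1 KB, then multiply."""
    return math.ceil(item_size_kb) * writes_per_second


print(wcus(0.5, 10))   # 10: a 0.5 KB item still costs a full 1 KB unit
print(wcus(2.5, 100))  # 300: 2.5 KB rounds up to 3 KB
```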
  • DynamoDB Accelerator (DAX) is a fully managed, in-memory, write-through cache for DynamoDB that runs in a cluster

  • Reads served by DAX are eventually consistent

  • Incoming requests are evenly distributed across all of the nodes in the cluster.

  • DAX can reduce read response times to microseconds

  • DAX is ideal for:

    • fastest response times possible
    • apps that read a small number of items more frequently
    • apps that are read intensive
  • DAX is not ideal for:

    • Apps that require strongly consistent reads
    • Apps that do not require microsecond read response times
    • Apps that are write intensive, or that do not perform much read activity
  • Notable DynamoDB API commands using the CLI, e.g. aws dynamodb <command>

    • get-item returns a set of attributes for the item with the given primary key. If there is no matching item, it does not return any data and there will be no Item element in the response.
    • put-item creates a new item, or replaces an old item with a new item. If an item that has the same primary key as the new item already exists in the specified table, the new item completely replaces the existing item.
    • update-item Edits an existing item’s attributes, or adds a new item to the table if it does not already exist.
    • batch-get-item returns the attributes of one or more items from one or more tables. You identify requested items by primary key. A single operation can retrieve up to 16 MB of data, which can contain as many as 100 items.
    • batch-write-item puts or deletes multiple items in one or more tables. A single call can write up to 16 MB of data, which can comprise as many as 25 put or delete requests. Individual items to be written can be as large as 400 KB.
    • create-table adds a new table to your account. Table names must be unique within each Region.
    • update-table Modifies the provisioned throughput settings, global secondary indexes, or DynamoDB Streams settings for a given table.
    • delete-table deletes a table and all of its items.
    • transact-get-items is a synchronous operation that atomically retrieves multiple items from one or more tables (but not from indexes) in a single account and Region. A call can contain up to 25 objects. The aggregate size of the items in the transaction cannot exceed 4 MB.
    • transact-write-items is a synchronous write operation that groups up to 25 action requests. These actions can target items in different tables, but not in different AWS accounts or Regions, and no two actions can target the same item.
    • query finds items based on primary key values. You can query a table or secondary index that has a composite primary key.
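To tie the Query options together, here is a minimal sketch that builds the request parameters for a Query call in the low-level API shape. It sends nothing to AWS. The helper `build_query`, the table `Orders`, and the key `CustomerId` are hypothetical, and the partition key is assumed to be a String (S).

```python
def build_query(table_name, pk_name, pk_value, *,
                strongly_consistent=False, descending=False, projection=None):
    """Build Query parameters; assumes a String (S) partition key."""
    params = {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": pk_name},
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
        "ConsistentRead": strongly_consistent,  # default: eventually consistent
        "ScanIndexForward": not descending,     # default: ascending sort key order
    }
    if projection:
        # Return only the listed attributes instead of whole items.
        params["ProjectionExpression"] = projection
    return params


print(build_query("Orders", "CustomerId", "c-42", descending=True))
```

With a boto3 DynamoDB client, the result could be passed as `client.query(**params)`, assuming credentials and the table exist. The equivalent CLI flags would be supplied to `aws dynamodb query`.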