Implement version control in DynamoDB
Amazon DynamoDB is a fully managed service provided by AWS that enables developers to quickly store data for their applications. In this article, I will showcase how to implement version control in DynamoDB for recording changes to data over time.
What is version control in DynamoDB
DynamoDB does not support native version control on a per-item basis. If you need to record changes to your data over time, it must be handled via the application. Luckily, there is a paradigm that supports storing multiple versions of the same data: duplication.
How to implement versioning in DynamoDB
We have a few options for storing versioned data in DynamoDB. For the purposes of this tutorial, we will use a single-table design, i.e., a partition key and a sort key that together manage multiple item types.
Creating a table with versioning in DynamoDB
For the remaining sections of this tutorial, we’ll leverage the same single table.
Create a table
To begin, let’s define a table using the AWS CLI:
aws dynamodb create-table \
  --table-name table \
  --attribute-definitions \
    AttributeName=PK,AttributeType=S \
    AttributeName=SK,AttributeType=S \
  --key-schema \
    AttributeName=PK,KeyType=HASH \
    AttributeName=SK,KeyType=RANGE \
  --provisioned-throughput \
    ReadCapacityUnits=5,WriteCapacityUnits=5
I specified a partition key named PK and a sort key named SK.
Time versioning
Time-based versioning may be useful for applications that need to store the status of data at certain timed intervals.
Inserting example data
To showcase the power of DynamoDB, let’s insert some values for a file object with an identifier of 1.
aws dynamodb put-item \
  --table-name table \
  --item '{"PK":{"S":"file#1"},"SK":{"S":"2024-01-13T11:25:27-05:00"}}'
and another:
aws dynamodb put-item \
  --table-name table \
  --item '{"PK":{"S":"file#1"},"SK":{"S":"2024-01-13T11:32:13-05:00"}}'
and one more:
aws dynamodb put-item \
  --table-name table \
  --item '{"PK":{"S":"file#1"},"SK":{"S":"2024-01-13T11:37:58-05:00"}}'
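The three writes above use hand-typed timestamps. As a minimal sketch of generating the sort key at write time, the helpers below build the same item shape from the current clock. The function names `utc_timestamp` and `time_version_item` are hypothetical, and normalizing to UTC is an assumption (the examples above use a -05:00 offset); it is one way to keep string order and time order aligned if writers run in different time zones.

```shell
# Hypothetical helpers, not part of the original tutorial.

# Print the current time as an ISO 8601 UTC timestamp, e.g. 2024-01-13T16:25:27Z.
# Assumption: all versions are keyed in UTC so string order matches time order.
utc_timestamp() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}

# Build the DynamoDB item JSON for one time-based version.
# $1 = file identifier, $2 = timestamp used as the sort key.
time_version_item() {
  printf '{"PK":{"S":"file#%s"},"SK":{"S":"%s"}}' "$1" "$2"
}

# Usage (requires AWS credentials and the table created earlier):
#   aws dynamodb put-item \
#     --table-name table \
#     --item "$(time_version_item 1 "$(utc_timestamp)")"
```

Mixing UTC keys with the offset-based keys shown above in the same partition would break the ordering, so a table should stick to one convention.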
Query for latest versions
Because the timestamps all use the same ISO 8601 format and UTC offset, they sort chronologically as strings, so we can leverage DynamoDB to perform the following requests:
- Grab the last 100 versions
- Grab all the versions by a specific interval (year, month, day, etc.)
Grab the latest 100 versions (--max-items caps the number of items the CLI returns; a single query response is also limited to 1 MB of data):
aws dynamodb query \
  --table-name table \
  --key-condition-expression "PK=:pk" \
  --expression-attribute-values '{":pk":{"S":"file#1"}}' \
  --no-scan-index-forward \
  --max-items 100
The --no-scan-index-forward flag is important: it sorts the records in descending order rather than the default ascending order.
Output:
{
    "Items": [
        {
            "PK": {
                "S": "file#1"
            },
            "SK": {
                "S": "2024-01-13T11:37:58-05:00"
            }
        },
        {
            "PK": {
                "S": "file#1"
            },
            "SK": {
                "S": "2024-01-13T11:32:13-05:00"
            }
        },
        {
            "PK": {
                "S": "file#1"
            },
            "SK": {
                "S": "2024-01-13T11:25:27-05:00"
            }
        }
    ],
    "Count": 3,
    "ScannedCount": 3,
    "ConsumedCapacity": null
}
Grab all versions by a specific interval:
Using the begins_with or between operators, we can query for specific dates.
In the case below, I want to query everything that starts with 2024-01-13T11:3:
aws dynamodb query \
  --table-name table \
  --key-condition-expression "PK=:pk and begins_with(SK, :sk)" \
  --expression-attribute-values '{":pk":{"S":"file#1"},":sk":{"S":"2024-01-13T11:3"}}' \
  --no-scan-index-forward
Output:
{
    "Items": [
        {
            "PK": {
                "S": "file#1"
            },
            "SK": {
                "S": "2024-01-13T11:37:58-05:00"
            }
        },
        {
            "PK": {
                "S": "file#1"
            },
            "SK": {
                "S": "2024-01-13T11:32:13-05:00"
            }
        }
    ],
    "Count": 2,
    "ScannedCount": 2,
    "ConsumedCapacity": null
}
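The begins_with operator is shown above, but between is only mentioned. A rough sketch of the range form follows; the function names `between_values` and `query_versions_between` and the placeholder names `:start` and `:end` are hypothetical choices for illustration. BETWEEN is inclusive on both ends.

```shell
# Hypothetical helper: build the expression attribute values for a
# partition key plus sort-key range query.
# $1 = file identifier, $2 = range start, $3 = range end.
between_values() {
  printf '{":pk":{"S":"file#%s"},":start":{"S":"%s"},":end":{"S":"%s"}}' "$1" "$2" "$3"
}

# Query every version of a file whose sort key falls within [start, end].
# Defined but not invoked here; it needs AWS credentials and the tutorial table.
query_versions_between() {
  aws dynamodb query \
    --table-name table \
    --key-condition-expression "PK = :pk AND SK BETWEEN :start AND :end" \
    --expression-attribute-values "$(between_values "$1" "$2" "$3")" \
    --no-scan-index-forward
}

# Usage: versions written between 11:30 and 11:35 on 2024-01-13.
#   query_versions_between 1 "2024-01-13T11:30:00-05:00" "2024-01-13T11:35:00-05:00"
```

With the three example items inserted earlier, this range should match only the 11:32:13 version, since the other two sort keys fall outside the bounds.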
Number versioning
For applications that want to maintain a “latest” version with the ability to roll back to a prior version, a number-based versioning paradigm works well.
Inserting example data
To showcase the power of DynamoDB, let’s insert some values for a file object with an identifier of 2 and two different versions.
The first item is the metadata item for file#2. It holds the attributes of the current version, so the application can fetch the latest values with a single read.
aws dynamodb put-item \
  --table-name table \
  --item '{"PK":{"S":"file#2"},"SK":{"S":"metadata"},"version":{"S":"2"},"foo":{"S":"baz"}}'
The second item will contain version 1’s information.
aws dynamodb put-item \
  --table-name table \
  --item '{"PK":{"S":"file#2"},"SK":{"S":"version#1"},"version":{"S":"1"},"foo":{"S":"bar"}}'
The third item will contain version 2’s information.
aws dynamodb put-item \
  --table-name table \
  --item '{"PK":{"S":"file#2"},"SK":{"S":"version#2"},"version":{"S":"2"},"foo":{"S":"baz"}}'
For this method, we duplicate the attributes and values of version#2 onto the main metadata object.
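Because each new version touches two items (the version item and the metadata item), a failure between the two writes could leave them out of sync. One possible way to keep them in step is the transact-write-items command, which applies both puts as a unit. The sketch below assumes that approach; the helper name `new_version_transaction` and the value `qux` are hypothetical, and `foo` stands in for whatever attributes the file actually carries.

```shell
# Hypothetical helper: build a transaction that writes version N and updates
# the metadata item together.
# $1 = file identifier, $2 = version number, $3 = value of the foo attribute.
new_version_transaction() {
  # First put: the immutable version item.
  printf '[{"Put":{"TableName":"table","Item":{"PK":{"S":"file#%s"},"SK":{"S":"version#%s"},"version":{"S":"%s"},"foo":{"S":"%s"}}}},' "$1" "$2" "$2" "$3"
  # Second put: the metadata item, duplicating the new version's attributes.
  printf '{"Put":{"TableName":"table","Item":{"PK":{"S":"file#%s"},"SK":{"S":"metadata"},"version":{"S":"%s"},"foo":{"S":"%s"}}}}]' "$1" "$2" "$3"
}

# Usage (needs AWS credentials; both puts succeed or fail together):
#   aws dynamodb transact-write-items \
#     --transact-items "$(new_version_transaction 2 3 qux)"
```

The same pattern could cover a rollback, where only the metadata item changes and a single put is enough.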
Query for latest versions
Let’s query for all versions:
aws dynamodb query \
  --table-name table \
  --key-condition-expression "PK=:pk and begins_with(SK, :sk)" \
  --expression-attribute-values '{":pk":{"S":"file#2"},":sk":{"S":"version#"}}' \
  --no-scan-index-forward
Output:
{
    "Items": [
        {
            "version": {
                "S": "2"
            },
            "SK": {
                "S": "version#2"
            },
            "PK": {
                "S": "file#2"
            },
            "foo": {
                "S": "baz"
            }
        },
        {
            "version": {
                "S": "1"
            },
            "SK": {
                "S": "version#1"
            },
            "PK": {
                "S": "file#2"
            },
            "foo": {
                "S": "bar"
            }
        }
    ],
    "Count": 2,
    "ScannedCount": 2,
    "ConsumedCapacity": null
}
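The descending order in this output comes from string comparison of the sort key, which works for versions 1 through 9 but not beyond: as a string, version#10 sorts before version#2. A small sketch of one workaround, zero-padding the number, is below. The helper name `version_sk` is hypothetical and the six-digit width is an arbitrary assumption; the tutorial itself uses unpadded keys.

```shell
# Hypothetical helper: zero-pad the version number so that string order
# matches numeric order. The width of 6 digits is an arbitrary assumption.
version_sk() {
  printf 'version#%06d' "$1"
}

# Unpadded keys: version#10 is listed before version#2.
printf '%s\n' 'version#10' 'version#2' | LC_ALL=C sort

# Padded keys: version#000002 is listed before version#000010.
printf '%s\n' "$(version_sk 10)" "$(version_sk 2)" | LC_ALL=C sort
```

If padded keys are adopted, the begins_with prefix version# used in the query still matches every version item.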
The user decides they want version#1 to be the selected version for file#2. To satisfy the request, perform the following steps:
- Modify the metadata item’s version attribute to 1
- Duplicate the version’s attributes onto the metadata item
aws dynamodb put-item \
  --table-name table \
  --item '{"PK":{"S":"file#2"},"SK":{"S":"metadata"},"version":{"S":"1"},"foo":{"S":"bar"}}'
The next time we fetch the latest version, the metadata item will point to version 1:
aws dynamodb get-item \
  --table-name table \
  --key '{"PK":{"S":"file#2"},"SK":{"S":"metadata"}}'
Output:
{
    "Item": {
        "version": {
            "S": "1"
        },
        "SK": {
            "S": "metadata"
        },
        "PK": {
            "S": "file#2"
        },
        "foo": {
            "S": "bar"
        }
    }
}