how.wtf

DynamoDB PartiQL guide

· Thomas Taylor

DynamoDB and PartiQL logos in a simple architecture image

AWS introduced a PartiQL in 2019 as a “query language for all your data” that resembles SQL-like expressions. In November 2020, it was natively integrated with DynamoDB.

In this post, we’ll explore using PartiQL with DynamoDB.

What is PartiQL

PartiQL is an open source initiative that is maintained by Amazon under the Apache 2.0 license. Its goal is to be a SQL-compatible query language for accessing relational, semi-structured, and nested data.

Many AWS services support PartiQL:

Note about DynamoDB

While PartiQL introduces SQL-like expressions for DynamoDB, DynamoDB is a NoSQL database. Optimizing your access patterns, understanding key structure, using secondary indexes, and ensuring you’re not scanning over large amounts of data is still required.

PartiQL cannot escape DynamoDB’s rigidness - a simple SELECT statement can mistakenly scan over the entire table’s data. Because of this, ensure you’re using WHERE clauses appropriately.

Creating a table using AWS CLI

Let’s first create a table using the many-to-many relationship model from AWS’s documentation.

 1aws dynamodb create-table \
 2    --table-name ExampleTable \
 3    --attribute-definitions \
 4        AttributeName=pk,AttributeType=S \
 5        AttributeName=sk,AttributeType=S \
 6    --key-schema \
 7        AttributeName=pk,KeyType=HASH \
 8        AttributeName=sk,KeyType=RANGE \
 9    --provisioned-throughput \
10        ReadCapacityUnits=5,WriteCapacityUnits=5

This creates a table with a hash key of pk and a range key of sk.

Inserting data using PartiQL

With our ExampleTable in AWS, let’s insert a single row of mock data using the INSERT INTO operator:

1INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'INVOICE#92551','date':'2018-02-07'}

Using the AWS CLI

1aws dynamodb execute-statement \
2    --statement "INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'INVOICE#92551','date':'2018-02-07'}"

Transactions using PartiQL

For the next bit, we’re going to use the transactions API. DynamoDB restricts all transactions to the same operation: either read or write. In our case, we’re going to write all the following records:

1aws dynamodb execute-transaction \
2    --transact-statements \
3        "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'BILL#4224663','date':'2018-02-07'}\"}" \
4        "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'BILL#4224687','date':'2018-01-09'}\"}" \
5        "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92552','sk':'INVOICE#92552','date':'2018-03-04'}\"}" \
6        "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92552','sk':'BILL#4224687','date':'2018-01-09'}\"}" \
7        "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'BILL#4224663','sk':'BILL#4224663','date':'2017-12-03'}\"}"

Inserting data using PartiQL with parameters

For the last insertion, we’ll use the --parameters option. Like SQL, we can use the ? question mark syntax to denote a parameter.

1aws dynamodb execute-statement \
2    --statement "INSERT INTO ExampleTable VALUE {'pk':?,'sk': ?,'date':?}" \
3    --parameters '[{"S":"BILL#4224687"},{"S":"BILL#4224687"},{"S":"2018-01-09"}]'

Updating data using PartiQL

PartiQL only supports updating a single item at a time. Let’s update the BILL#4224663 item with a new attribute named foo with a value of bar.

1aws dynamodb execute-statement \
2    --statement "UPDATE ExampleTable SET foo='bar' WHERE pk='BILL#4224663' and sk='BILL#4224663'"

The statement above creates a new attribute named foo with a vlaue of bar on the item with a pk and an sk of BILL#4224663. Like the DynamoDB put-item API call, we must specify the hash and range key values in the WHERE clause.

The following error will happen if we don’t specify both:

1An error occurred (ValidationException) when calling the ExecuteStatement operation: Where clause does not contain a mandatory equality on all key attributes

Removing an attribute or field using PartiQL

PartiQL also supports removing attributes using the following syntax:

1aws dynamodb execute-statement \
2    --statement "UPDATE ExampleTable REMOVE foo WHERE pk='BILL#4224663' and sk='BILL#4224663'"

This removes the foo field we created in the last section.

Query examples using PartiQL

As documented by AWS, the supported SELECT syntax for PartiQL is the following:

1SELECT expression  [, ...]
2FROM table[.index]
3[ WHERE condition ] [ [ORDER BY key [DESC|ASC] , ...]

YOU MUST SPECIFY PARTITION KEYS IF YOU DO NOT WANT TO PERFORM A FULL TABLE SCAN.

Using our table with its data from the prior examples, let’s select BILL#4224663’s record:

1SELECT *
2FROM ExampleTable
3WHERE pk = 'BILL#4224663' AND sk = 'BILL#4224663'

AWS CLI:

1aws dynamodb execute-statement \
2    --statement "SELECT * FROM ExampleTable WHERE pk = 'BILL#4224663' AND sk = 'BILL#4224663'"

Output:

 1{
 2    "Items": [
 3        {
 4            "sk": {
 5                "S": "BILL#4224663"
 6            },
 7            "date": {
 8                "S": "2017-12-03"
 9            },
10            "pk": {
11                "S": "BILL#4224663"
12            }
13        }
14    ]
15}

We can additionally select multiple records:

1aws dynamodb execute-statement \
2    --statement "SELECT pk, sk FROM ExampleTable WHERE pk IN ['INVOICE#92551', 'INVOICE#92552']"

PartiQL will execute two DynamoDB queries on our behalf since it’s searching for two pk values.

Querying with begins_with in PartiQL

Similarly, we can use the DynamoDB begins_with functionality in PartiQL.

In the example below, we query for all bills associated with INVOICE#92551

1SELECT *
2FROM ExampleTable
3where pk = 'INVOICE#92551' AND begins_with(sk, 'BILL#')

AWS CLI:

1aws dynamodb execute-statement \
2    --statement "SELECT * FROM ExampleTable where pk = 'INVOICE#92551' AND begins_with(sk, 'BILL#')"

Output:

 1{
 2    "Items": [
 3        {
 4            "sk": {
 5                "S": "BILL#4224663"
 6            },
 7            "date": {
 8                "S": "2018-02-07"
 9            },
10            "pk": {
11                "S": "INVOICE#92551"
12            }
13        },
14        {
15            "sk": {
16                "S": "BILL#4224687"
17            },
18            "date": {
19                "S": "2018-01-09"
20            },
21            "pk": {
22                "S": "INVOICE#92551"
23            }
24        }
25    ]
26}

Protecting against scans using PartiQL

PartiQL, while powerful, can be dangerous.

Let’s try to find items with a date attribute that is greater than 2018-01-09.

1SELECT *
2FROM ExampleTable
3WHERE "date" > '2018-01-09'

AWS CLI:

1aws dynamodb execute-statement \
2    --statement "SELECT * FROM ExampleTable WHERE \"date\" > '2018-01-09'"

Output:

 1{
 2    "Items": [
 3        {
 4            "sk": {
 5                "S": "BILL#4224663"
 6            },
 7            "date": {
 8                "S": "2018-02-07"
 9            },
10            "pk": {
11                "S": "INVOICE#92551"
12            }
13        },
14        {
15            "sk": {
16                "S": "INVOICE#92551"
17            },
18            "date": {
19                "S": "2018-02-07"
20            },
21            "pk": {
22                "S": "INVOICE#92551"
23            }
24        },
25        {
26            "sk": {
27                "S": "INVOICE#92552"
28            },
29            "date": {
30                "S": "2018-03-04"
31            },
32            "pk": {
33                "S": "INVOICE#92552"
34            }
35        }
36    ]
37}

A seemingly harmless query just completed an entire table scan because it used a non-key attribute in the WHERE clause.

To protect against these kinds of queries, AWS recommends using a DENY IAM policy:

 1{
 2   "Version":"2012-10-17",
 3   "Statement":[
 4      {
 5         "Effect":"Deny",
 6         "Action":[
 7            "dynamodb:PartiQLSelect"
 8         ],
 9         "Resource":[
10            "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
11         ],
12         "Condition":{
13            "Bool":{
14               "dynamodb:FullTableScan":[
15                  "true"
16               ]
17            }
18         }
19      },
20      {
21         "Effect":"Allow",
22         "Action":[
23            "dynamodb:PartiQLSelect"
24         ],
25         "Resource":[
26            "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
27         ]
28      }
29   ]
30}

Deleting data using PartiQL

Like updating, PartiQL can only delete 1 item from a table at a time.

We must specify all the keys associated with an item:

1aws dynamodb execute-statement \
2    --statement "DELETE FROM ExampleTable WHERE pk='INVOICE#92551' AND sk='BILL#4224663'"

Conclusion

In this post, we went over how to insert, update, query, and delete data from DynamoDB using PartiQL. PartiQL is a fun way to query DynamoDB, but, as we discussed, can easily turn into a DynamoDB killer with full table scans.

#aws   #dynamodb   #serverless  

Reply to this post by email ↪