DynamoDB PartiQL guide
AWS introduced a PartiQL in 2019 as a “query language for all your data” that resembles SQL-like expressions. In November 2020, it was natively integrated with DynamoDB.
In this post, we’ll explore using PartiQL with DynamoDB.
What is PartiQL
PartiQL is an open source initiative that is maintained by Amazon under the Apache 2.0 license. Its goal is to be a SQL-compatible query language for accessing relational, semi-structured, and nested data.
Many AWS services support PartiQL:
Note about DynamoDB
While PartiQL introduces SQL-like expressions for DynamoDB, DynamoDB is a NoSQL database. Optimizing your access patterns, understanding key structure, using secondary indexes, and ensuring you’re not scanning over large amounts of data is still required.
PartiQL cannot escape DynamoDB’s rigidness - a simple SELECT statement can mistakenly scan over the entire table’s data. Because of this, ensure you’re using WHERE
clauses appropriately.
Creating a table using AWS CLI
Let’s first create a table using the many-to-many relationship model from AWS’s documentation.
1aws dynamodb create-table \
2 --table-name ExampleTable \
3 --attribute-definitions \
4 AttributeName=pk,AttributeType=S \
5 AttributeName=sk,AttributeType=S \
6 --key-schema \
7 AttributeName=pk,KeyType=HASH \
8 AttributeName=sk,KeyType=RANGE \
9 --provisioned-throughput \
10 ReadCapacityUnits=5,WriteCapacityUnits=5
This creates a table with a hash key of pk
and a range key of sk
.
Inserting data using PartiQL
With our ExampleTable
in AWS, let’s insert a single row of mock data using the INSERT INTO
operator:
1INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'INVOICE#92551','date':'2018-02-07'}
Using the AWS CLI
1aws dynamodb execute-statement \
2 --statement "INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'INVOICE#92551','date':'2018-02-07'}"
Transactions using PartiQL
For the next bit, we’re going to use the transactions
API. DynamoDB restricts all transactions to the same operation: either read
or write
. In our case, we’re going to write all the following records:
1aws dynamodb execute-transaction \
2 --transact-statements \
3 "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'BILL#4224663','date':'2018-02-07'}\"}" \
4 "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92551','sk':'BILL#4224687','date':'2018-01-09'}\"}" \
5 "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92552','sk':'INVOICE#92552','date':'2018-03-04'}\"}" \
6 "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'INVOICE#92552','sk':'BILL#4224687','date':'2018-01-09'}\"}" \
7 "{\"Statement\":\"INSERT INTO ExampleTable VALUE {'pk':'BILL#4224663','sk':'BILL#4224663','date':'2017-12-03'}\"}"
Inserting data using PartiQL with parameters
For the last insertion, we’ll use the --parameters
option. Like SQL, we can use the ?
question mark syntax to denote a parameter.
1aws dynamodb execute-statement \
2 --statement "INSERT INTO ExampleTable VALUE {'pk':?,'sk': ?,'date':?}" \
3 --parameters '[{"S":"BILL#4224687"},{"S":"BILL#4224687"},{"S":"2018-01-09"}]'
Updating data using PartiQL
PartiQL only supports updating a single item at a time. Let’s update the BILL#4224663
item with a new attribute named foo
with a value of bar
.
1aws dynamodb execute-statement \
2 --statement "UPDATE ExampleTable SET foo='bar' WHERE pk='BILL#4224663' and sk='BILL#4224663'"
The statement above creates a new attribute named foo
with a vlaue of bar
on the item with a pk
and an sk
of BILL#4224663
. Like the DynamoDB put-item
API call, we must specify the hash and range key values in the WHERE
clause.
The following error will happen if we don’t specify both:
1An error occurred (ValidationException) when calling the ExecuteStatement operation: Where clause does not contain a mandatory equality on all key attributes
Removing an attribute or field using PartiQL
PartiQL also supports removing attributes using the following syntax:
1aws dynamodb execute-statement \
2 --statement "UPDATE ExampleTable REMOVE foo WHERE pk='BILL#4224663' and sk='BILL#4224663'"
This removes the foo
field we created in the last section.
Query examples using PartiQL
As documented by AWS, the supported SELECT
syntax for PartiQL is the following:
1SELECT expression [, ...]
2FROM table[.index]
3[ WHERE condition ] [ [ORDER BY key [DESC|ASC] , ...]
YOU MUST SPECIFY PARTITION KEYS IF YOU DO NOT WANT TO PERFORM A FULL TABLE SCAN.
Using our table with its data from the prior examples, let’s select BILL#4224663
’s record:
1SELECT *
2FROM ExampleTable
3WHERE pk = 'BILL#4224663' AND sk = 'BILL#4224663'
AWS CLI:
1aws dynamodb execute-statement \
2 --statement "SELECT * FROM ExampleTable WHERE pk = 'BILL#4224663' AND sk = 'BILL#4224663'"
Output:
1{
2 "Items": [
3 {
4 "sk": {
5 "S": "BILL#4224663"
6 },
7 "date": {
8 "S": "2017-12-03"
9 },
10 "pk": {
11 "S": "BILL#4224663"
12 }
13 }
14 ]
15}
We can additionally select multiple records:
1aws dynamodb execute-statement \
2 --statement "SELECT pk, sk FROM ExampleTable WHERE pk IN ['INVOICE#92551', 'INVOICE#92552']"
PartiQL will execute two DynamoDB queries on our behalf since it’s searching for two pk
values.
Querying with begins_with in PartiQL
Similarly, we can use the DynamoDB begins_with
functionality in PartiQL.
In the example below, we query for all bills associated with INVOICE#92551
1SELECT *
2FROM ExampleTable
3where pk = 'INVOICE#92551' AND begins_with(sk, 'BILL#')
AWS CLI:
1aws dynamodb execute-statement \
2 --statement "SELECT * FROM ExampleTable where pk = 'INVOICE#92551' AND begins_with(sk, 'BILL#')"
Output:
1{
2 "Items": [
3 {
4 "sk": {
5 "S": "BILL#4224663"
6 },
7 "date": {
8 "S": "2018-02-07"
9 },
10 "pk": {
11 "S": "INVOICE#92551"
12 }
13 },
14 {
15 "sk": {
16 "S": "BILL#4224687"
17 },
18 "date": {
19 "S": "2018-01-09"
20 },
21 "pk": {
22 "S": "INVOICE#92551"
23 }
24 }
25 ]
26}
Protecting against scans using PartiQL
PartiQL, while powerful, can be dangerous.
Let’s try to find items with a date
attribute that is greater than 2018-01-09
.
1SELECT *
2FROM ExampleTable
3WHERE "date" > '2018-01-09'
AWS CLI:
1aws dynamodb execute-statement \
2 --statement "SELECT * FROM ExampleTable WHERE \"date\" > '2018-01-09'"
Output:
1{
2 "Items": [
3 {
4 "sk": {
5 "S": "BILL#4224663"
6 },
7 "date": {
8 "S": "2018-02-07"
9 },
10 "pk": {
11 "S": "INVOICE#92551"
12 }
13 },
14 {
15 "sk": {
16 "S": "INVOICE#92551"
17 },
18 "date": {
19 "S": "2018-02-07"
20 },
21 "pk": {
22 "S": "INVOICE#92551"
23 }
24 },
25 {
26 "sk": {
27 "S": "INVOICE#92552"
28 },
29 "date": {
30 "S": "2018-03-04"
31 },
32 "pk": {
33 "S": "INVOICE#92552"
34 }
35 }
36 ]
37}
A seemingly harmless query just completed an entire table scan because it used a non-key attribute in the WHERE
clause.
To protect against these kinds of queries, AWS recommends using a DENY
IAM policy:
1{
2 "Version":"2012-10-17",
3 "Statement":[
4 {
5 "Effect":"Deny",
6 "Action":[
7 "dynamodb:PartiQLSelect"
8 ],
9 "Resource":[
10 "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
11 ],
12 "Condition":{
13 "Bool":{
14 "dynamodb:FullTableScan":[
15 "true"
16 ]
17 }
18 }
19 },
20 {
21 "Effect":"Allow",
22 "Action":[
23 "dynamodb:PartiQLSelect"
24 ],
25 "Resource":[
26 "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
27 ]
28 }
29 ]
30}
Deleting data using PartiQL
Like updating, PartiQL can only delete 1 item from a table at a time.
We must specify all the keys associated with an item:
1aws dynamodb execute-statement \
2 --statement "DELETE FROM ExampleTable WHERE pk='INVOICE#92551' AND sk='BILL#4224663'"
Conclusion
In this post, we went over how to insert, update, query, and delete data from DynamoDB using PartiQL. PartiQL is a fun way to query DynamoDB, but, as we discussed, can easily turn into a DynamoDB killer with full table scans.