Get Anthropic Claude token counts from Amazon Bedrock
Last year, I wrote a post showcasing how to tokenize words using the Anthropic SDK to track token usage.
While this method continues to work, it’s no longer necessary: Amazon Bedrock now returns token counts in the response metadata.
In this guide, we’ll call Claude through the AWS SDK and read those token counts from the response.
Install the AWS SDK
To begin, install boto3, the AWS SDK for Python, using pip:
pip install boto3
List model access
The invoke_model call requires a model ID. To determine which models are accessible in your AWS account, run the following command using the aws CLI:
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --query "modelSummaries[*].modelId"
Output:
[
  "anthropic.claude-instant-v1:2:100k",
  "anthropic.claude-instant-v1",
  "anthropic.claude-v1",
  "anthropic.claude-v2:0:18k",
  "anthropic.claude-v2:0:100k",
  "anthropic.claude-v2:1:18k",
  "anthropic.claude-v2:1:200k",
  "anthropic.claude-v2:1",
  "anthropic.claude-v2"
]
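The same list can be fetched from Python. The sketch below assumes the boto3 bedrock client (the control-plane client, separate from bedrock-runtime) and its list_foundation_models call with the byProvider filter. The helper names extract_model_ids and list_anthropic_model_ids are made up for illustration and are not part of the SDK.

```python
def extract_model_ids(response):
    """Pull the modelId of each entry out of a list_foundation_models response."""
    return [summary["modelId"] for summary in response.get("modelSummaries", [])]


def list_anthropic_model_ids():
    """Ask Bedrock for its Anthropic models (needs AWS credentials and a region)."""
    import boto3  # imported here so extract_model_ids stays usable offline

    # "bedrock" is the control-plane client, distinct from "bedrock-runtime".
    bedrock = boto3.client("bedrock")
    response = bedrock.list_foundation_models(byProvider="anthropic")
    return extract_model_ids(response)
```

Calling list_anthropic_model_ids() with valid credentials should return a list of model IDs like the CLI output above.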
Call Claude using Bedrock client
The AWS SDK provides a bedrock-runtime client with an invoke_model method. We’ll call it with a request body built from the Claude inference parameters documentation.
import boto3
import json

# Runtime client used for model inference
bedrock = boto3.client("bedrock-runtime")

# Request body in the Claude text-completion format
params = {
    "prompt": "\n\nHuman: Who are you?\n\nAssistant:",
    "max_tokens_to_sample": 200,
}

response = bedrock.invoke_model(
    body=json.dumps(params).encode(),
    modelId="anthropic.claude-instant-v1"
)

print(response)
Here’s the output pretty-printed in JSON format:
{
  "ResponseMetadata": {
    "RequestId": "d72967be-90c2-4118-951e-a555455d5d7a",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Sat, 10 Feb 2024 05:55:23 GMT",
      "content-type": "application/json",
      "content-length": "117",
      "connection": "keep-alive",
      "x-amzn-requestid": "d72967be-90c2-4118-951e-a555455d5d7a",
      "x-amzn-bedrock-invocation-latency": "430",
      "x-amzn-bedrock-output-token-count": "17",
      "x-amzn-bedrock-input-token-count": "13"
    },
    "RetryAttempts": 0
  },
  "contentType": "application/json",
  "body": "..."
}
Notice what ResponseMetadata.HTTPHeaders gives us: the x-amzn-bedrock-input-token-count and x-amzn-bedrock-output-token-count headers hold the token counts for the request.
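Printing the response directly shows a Python dict whose body value is a stream object, so it cannot be passed to json.dumps as-is. Here is a minimal sketch of one way to get a pretty-printed view like the one above. It is not necessarily how that output was produced, and the pretty_response helper name is hypothetical.

```python
import json


def pretty_response(response):
    """Render an invoke_model response as indented JSON.

    The "body" value is a stream object, not JSON-serializable, so it is
    replaced with a "..." placeholder instead of being read.
    """
    printable = dict(response)  # shallow copy; the original response is untouched
    if "body" in printable:
        printable["body"] = "..."
    # default=str is a safety net for any other non-JSON value (an assumption).
    return json.dumps(printable, indent=2, default=str)
```

Because the body is swapped out rather than read, the stream remains available for a later response["body"].read() call.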
Count the Anthropic Claude token input and output
We can modify the code snippet above to reveal the model’s output and token usage:
import boto3
import json

bedrock = boto3.client("bedrock-runtime")

params = {
    "prompt": "\n\nHuman: Who are you?\n\nAssistant:",
    "max_tokens_to_sample": 200,
}

response = bedrock.invoke_model(
    body=json.dumps(params).encode(), modelId="anthropic.claude-instant-v1"
)

# The body is a stream; read it once and parse the JSON payload
body = response["body"].read()
data = json.loads(body)
completion = data.get("completion")

# Token counts arrive as HTTP headers (string values)
input_token_count = response["ResponseMetadata"]["HTTPHeaders"][
    "x-amzn-bedrock-input-token-count"
]

output_token_count = response["ResponseMetadata"]["HTTPHeaders"][
    "x-amzn-bedrock-output-token-count"
]

print(completion, input_token_count, output_token_count, sep="\n")
Output:
 My name is Claude. I'm an AI assistant created by Anthropic.
13
20
That was easy!
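Since the header values are strings, it can help to convert them to integers before adding them up or logging them. A small sketch, assuming the response has the same shape as the one shown earlier; the token_usage helper and the two constant names are hypothetical, not part of boto3.

```python
INPUT_HEADER = "x-amzn-bedrock-input-token-count"
OUTPUT_HEADER = "x-amzn-bedrock-output-token-count"


def token_usage(response):
    """Return input, output, and total token counts as integers."""
    headers = response["ResponseMetadata"]["HTTPHeaders"]
    input_tokens = int(headers[INPUT_HEADER])
    output_tokens = int(headers[OUTPUT_HEADER])
    return {
        "input": input_tokens,
        "output": output_tokens,
        "total": input_tokens + output_tokens,
    }
```

For a response carrying the header values from the last run (13 input tokens, 20 output tokens), token_usage(response) returns {"input": 13, "output": 20, "total": 33}.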