
Get Anthropic Claude Amazon Bedrock token counts

· Thomas Taylor

Last year, I wrote a post showcasing how to tokenize words using the Anthropic SDK to track token usage.

While this method continues to work, it's no longer necessary: the Amazon Bedrock team has added token metrics to the SDK response metadata.

In this guide, we’ll explore the SDK and how to get those token counts.

Install the AWS SDK

To begin, install boto3, the AWS SDK for Python, using pip:

pip install boto3
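
Bedrock support landed in boto3 fairly recently, so if you already had it installed, confirm that your version is new enough to know about the bedrock-runtime service. A quick, optional sanity check:

import boto3

# boto3 needs a release recent enough to include the Bedrock services
print(boto3.__version__)
print("bedrock-runtime" in boto3.session.Session().get_available_services())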

List model access

The invoke_model call requires a model ID. To determine which models are accessible in your AWS account, run the following command using the AWS CLI:

aws bedrock list-foundation-models \
    --by-provider anthropic \
    --query "modelSummaries[*].modelId"

Output:

[
    "anthropic.claude-instant-v1:2:100k",
    "anthropic.claude-instant-v1",
    "anthropic.claude-v1",
    "anthropic.claude-v2:0:18k",
    "anthropic.claude-v2:0:100k",
    "anthropic.claude-v2:1:18k",
    "anthropic.claude-v2:1:200k",
    "anthropic.claude-v2:1",
    "anthropic.claude-v2"
]
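
If you'd rather stay in Python, the same list is available from the bedrock control-plane client (note: bedrock, not bedrock-runtime). A minimal sketch:

import boto3

# The "bedrock" client exposes control-plane operations such as
# list_foundation_models; "bedrock-runtime" is only for invocations.
bedrock = boto3.client("bedrock")

response = bedrock.list_foundation_models(byProvider="anthropic")
for model in response["modelSummaries"]:
    print(model["modelId"])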

Call Claude using Bedrock client

The AWS SDK provides a bedrock-runtime client with an invoke_model function. We'll use it, along with the Claude inference parameters documentation, to invoke the model.

import boto3
import json

# The Bedrock runtime client handles model invocations
bedrock = boto3.client("bedrock-runtime")

# Claude text-completion inference parameters
params = {
    "prompt": "\n\nHuman: Who are you?\n\nAssistant:",
    "max_tokens_to_sample": 200,
}

response = bedrock.invoke_model(
    body=json.dumps(params).encode(),
    modelId="anthropic.claude-instant-v1"
)

print(response)

Here's the output, pretty-printed as JSON:

{
  "ResponseMetadata": {
    "RequestId": "d72967be-90c2-4118-951e-a555455d5d7a",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Sat, 10 Feb 2024 05:55:23 GMT",
      "content-type": "application/json",
      "content-length": "117",
      "connection": "keep-alive",
      "x-amzn-requestid": "d72967be-90c2-4118-951e-a555455d5d7a",
      "x-amzn-bedrock-invocation-latency": "430",
      "x-amzn-bedrock-output-token-count": "17",
      "x-amzn-bedrock-input-token-count": "13"
    },
    "RetryAttempts": 0
  },
  "contentType": "application/json",
  "body": "..."
}

Notice the x-amzn-bedrock-input-token-count and x-amzn-bedrock-output-token-count headers under ResponseMetadata.HTTPHeaders? That's the token usage we're after.

Count the Anthropic Claude token input and output

We can modify the code snippet above to reveal the model’s output and token usage:

import boto3
import json

bedrock = boto3.client("bedrock-runtime")

params = {
    "prompt": "\n\nHuman: Who are you?\n\nAssistant:",
    "max_tokens_to_sample": 200,
}

response = bedrock.invoke_model(
    body=json.dumps(params).encode(), modelId="anthropic.claude-instant-v1"
)

# The body is a streaming object; read and parse it to get the completion
body = response["body"].read()
data = json.loads(body)
completion = data.get("completion")

# Token counts are returned as HTTP headers in the response metadata
input_token_count = response["ResponseMetadata"]["HTTPHeaders"][
    "x-amzn-bedrock-input-token-count"
]

output_token_count = response["ResponseMetadata"]["HTTPHeaders"][
    "x-amzn-bedrock-output-token-count"
]

print(completion, input_token_count, output_token_count, sep="\n")

Output:

 My name is Claude. I'm an AI assistant created by Anthropic.
13
20

That was easy!
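
If you call Bedrock from more than one place, it's handy to wrap all of this in a small helper that returns the completion along with the token counts as integers (the headers come back as strings). The invoke_claude function below is just a sketch of that idea, not part of the Bedrock API:

import boto3
import json

bedrock = boto3.client("bedrock-runtime")


def invoke_claude(prompt: str, model_id: str = "anthropic.claude-instant-v1"):
    """Invoke a Claude text-completion model on Bedrock.

    Returns (completion, input_tokens, output_tokens).
    """
    params = {
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": 200,
    }
    response = bedrock.invoke_model(
        body=json.dumps(params).encode(), modelId=model_id
    )
    data = json.loads(response["body"].read())
    headers = response["ResponseMetadata"]["HTTPHeaders"]
    # Header values are strings, so cast the counts to ints before returning
    return (
        data.get("completion"),
        int(headers["x-amzn-bedrock-input-token-count"]),
        int(headers["x-amzn-bedrock-output-token-count"]),
    )


completion, input_tokens, output_tokens = invoke_claude("Who are you?")
print(completion, input_tokens, output_tokens, sep="\n")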

#generative-ai   #python  
