Amazon Bedrock with images using Claude 3 Sonnet with Python

2024-03-04 · Thomas Taylor

On March 4th, 2024, the Claude 3 Sonnet foundation model became available in Amazon Bedrock. Not only is the model a major improvement over the Claude 2.0 family, it also accepts images as input.

In this post, we’ll explore how to invoke Claude 3 Sonnet with an image using the boto3 library in Python.

Getting started

For those unfamiliar with the Bedrock runtime API, I have a post that demonstrates how to invoke it using Claude 2.0 and Python boto3. For this post, I’ll be focusing specifically on image invocations.

First, install boto3:

pip3 install boto3

How to send images to Amazon Bedrock

Fortunately, the API is straightforward to use. Let’s instantiate the Amazon Bedrock Runtime client:

import boto3

runtime = boto3.client("bedrock-runtime")

For Claude 3 Sonnet, the request body must follow the Anthropic Claude Messages API format:

{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": "..."
                    }
                },
                {
                    "type": "text",
                    "text": "What is in this image?"
                }
            ]
        }
    ]
}

For the Claude 3 Sonnet model, the image is supplied as a base64-encoded string in the data field.
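A small helper can assemble this request body from raw image bytes. The following is a minimal sketch rather than part of the Bedrock API: the function name `build_image_request` and its parameters are hypothetical, and the placeholder bytes only stand in for a real image.

```python
import base64
import json


def build_image_request(image_bytes, prompt, media_type="image/jpeg", max_tokens=1024):
    """Return a Messages API request body (JSON string) with one image and one prompt.

    Hypothetical helper for illustration; the field names mirror the payload above.
    """
    # The Messages API expects the image as base64 text, not raw bytes.
    encoded_image = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "image",
                            "source": {
                                "type": "base64",
                                "media_type": media_type,
                                "data": encoded_image,
                            },
                        },
                        {"type": "text", "text": prompt},
                    ],
                }
            ],
        }
    )


# Placeholder bytes, used only to show the call; pass real image bytes in practice.
body = build_image_request(b"placeholder-bytes", "What is in this image?")
```

The returned string can be passed directly as the body argument of invoke_model.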

In my case, I’ll read the image from a local file.

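import base64

# Read the raw bytes of the local image
with open("image.jpg", "rb") as image_file:
    image_bytes = image_file.read()

# Base64-encode the bytes and convert them to a str for the JSON payload
encoded_image = base64.b64encode(image_bytes).decode("utf-8")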

Alternatively, you can download the image using the requests library.

import base64

import requests

image_url = "https://url-to-the-image.com/file.jpg"
image_bytes = requests.get(image_url).content

encoded_image = base64.b64encode(image_bytes).decode("utf-8")
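The media_type in the request has to match the actual image format, and the examples in this post hard-code image/jpeg. If your images come in other formats, one option is to derive the media type from the file extension. This is a rough sketch under that assumption: the `media_type_for` helper and the `MEDIA_TYPES` table are hypothetical names, and the table lists the formats Claude 3 accepts (JPEG, PNG, GIF, and WebP).

```python
from pathlib import Path

# Image formats Claude 3 accepts, keyed by lowercase file extension.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def media_type_for(path):
    """Return the media type for a local image path, based on its extension only."""
    suffix = Path(path).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image extension: {suffix!r}")
    return MEDIA_TYPES[suffix]


print(media_type_for("image.jpg"))
```

This only inspects the file name, so a mislabeled file would still get the wrong media type.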

The image I’m using is the Hohenzollern Castle in Germany.

Here is the full example:

import base64
import json

import boto3

runtime = boto3.client("bedrock-runtime")

# Read the local image and base64-encode it for the request body
with open("image.jpg", "rb") as image_file:
    image_bytes = image_file.read()

encoded_image = base64.b64encode(image_bytes).decode("utf-8")

# Build the Messages API payload: one image block followed by one text block
body = json.dumps(
    {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/jpeg",
                            "data": encoded_image,
                        },
                    },
                    {"type": "text", "text": "What is in this image?"},
                ],
            }
        ],
    }
)

response = runtime.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=body
)

# The response body is a stream; read it and parse the JSON
response_body = json.loads(response.get("body").read())

print(response_body)

Here’s the response_body formatted as JSON:

{
  "id":"msg_01315rokf5kw6LSnnW6aWyVb",
  "type":"message",
  "role":"assistant",
  "content":[
    {
      "type":"text",
      "text":"This image shows the famous Hohenzollern Castle in Germany. The castle sits majestically atop a cliff, overlooking a river or lake in the foreground. The massive medieval fortress features ornate towers, spires, and archways in a Romanesque architectural style with red tiled roofs. The castle appears to be well-preserved and maintained, set against a picturesque blue sky with fluffy white clouds. The surrounding scenery includes bare trees and a small playground area near the water's edge, suggesting this is a popular tourist destination showcasing Germany's rich historical heritage and stunning natural landscapes."
    }
  ],
  "model":"claude-3-sonnet-28k-20240229",
  "stop_reason":"end_turn",
  "stop_sequence":null,
  "usage":{
    "input_tokens":1606,
    "output_tokens":130
  }
}

Sure enough, the model accurately described the image.
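In practice you usually want just the generated text and perhaps the token counts, not the whole payload. The sketch below assumes the response shape shown above; the helper names `extract_text` and `total_tokens` are hypothetical, and the sample dictionary is abbreviated for brevity.

```python
def extract_text(response_body):
    """Concatenate the text of every "text" block in the response content."""
    return "".join(
        block["text"]
        for block in response_body.get("content", [])
        if block.get("type") == "text"
    )


def total_tokens(response_body):
    """Sum the input and output token counts reported under "usage"."""
    usage = response_body.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)


# Abbreviated sample with the same structure as the response above.
sample = {
    "content": [{"type": "text", "text": "This image shows a castle."}],
    "usage": {"input_tokens": 1606, "output_tokens": 130},
}

print(extract_text(sample))
print(total_tokens(sample))
```

Filtering on the block type keeps the helper from failing if the content list ever contains a block without a text field.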

#Aws #Python #Generative-Ai
