Create an Amazon Bedrock inference endpoint
Added in 8.12.0
Creates an inference endpoint to perform an inference task with the amazonbedrock service.
You need to provide the access and secret keys only once, when you create the inference endpoint. The get inference API does not return your access or secret keys. After the endpoint is created, you cannot change its key pair. To use a different access and secret key pair, delete the inference endpoint and recreate it with the same name and the updated keys.
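Because keys cannot be updated in place, rotating them amounts to a delete followed by a create on the same path. The sketch below only plans those two calls and does not send anything. The function name `plan_key_rotation` and the endpoint name `my-endpoint` are hypothetical, and it assumes the delete call uses the same `_inference/{task_type}/{id}` path as the create call.

```python
from __future__ import annotations

from typing import Any


def plan_key_rotation(
    task_type: str, inference_id: str, new_body: dict[str, Any]
) -> list[tuple[str, str, dict[str, Any] | None]]:
    """Return the ordered (method, path, body) calls needed to swap the key pair.

    ASSUMPTION: the endpoint is deleted and recreated on the same path,
    /_inference/{task_type}/{inference_id}.
    """
    path = f"/_inference/{task_type}/{inference_id}"
    return [
        ("DELETE", path, None),   # remove the endpoint that holds the old keys
        ("PUT", path, new_body),  # recreate it under the same name with new keys
    ]


if __name__ == "__main__":
    body = {
        "service": "amazonbedrock",
        "service_settings": {
            "access_key": "NEW-AWS-access-key",
            "secret_key": "NEW-AWS-secret-key",
            "region": "us-east-1",
            "provider": "amazontitan",
            "model": "amazon.titan-embed-text-v2:0",
        },
    }
    for method, path, payload in plan_key_rotation("text_embedding", "my-endpoint", body):
        print(method, path, "with body" if payload else "no body")
```

Keeping the plan separate from the transport makes it easy to review or log the two calls before running them against a cluster.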
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running.
After creating the endpoint, wait for the model deployment to complete before using it.
To verify the deployment status, use the get trained model statistics API.
Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count".
Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
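The deployment check described above can be sketched as a small predicate over an already-parsed statistics response. The field names `state`, `allocation_count`, and `target_allocation_count` come from the text. The nesting under `trained_model_stats[*].deployment_stats.allocation_status` is an assumption about the response shape, and the function names are hypothetical.

```python
from __future__ import annotations

from typing import Any


def is_fully_allocated(allocation_status: dict[str, Any]) -> bool:
    """True when the state is fully_allocated and both allocation counts match."""
    count = allocation_status.get("allocation_count")
    return (
        allocation_status.get("state") == "fully_allocated"
        and count is not None
        and count == allocation_status.get("target_allocation_count")
    )


def deployment_ready(stats_response: dict[str, Any]) -> bool:
    """Check every model entry in a parsed trained model statistics response.

    ASSUMPTION: allocation details live at
    trained_model_stats[*].deployment_stats.allocation_status.
    """
    models = stats_response.get("trained_model_stats", [])
    statuses = [
        model.get("deployment_stats", {}).get("allocation_status", {})
        for model in models
    ]
    # An empty response is treated as "not ready" rather than vacuously ready.
    return bool(statuses) and all(is_fully_allocated(s) for s in statuses)


if __name__ == "__main__":
    sample = {
        "trained_model_stats": [
            {
                "deployment_stats": {
                    "allocation_status": {
                        "state": "fully_allocated",
                        "allocation_count": 1,
                        "target_allocation_count": 1,
                    }
                }
            }
        ]
    }
    print(deployment_ready(sample))
```

A caller would typically poll the statistics API on an interval and stop once `deployment_ready` returns true.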
Path parameters
- task_type (string, Required): The type of the inference task that the model will perform. Values are completion or text_embedding.
- amazonbedrock_inference_id (string, Required): The unique identifier of the inference endpoint.
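As a rough illustration of how the two path parameters combine, the sketch below validates `task_type` against the two documented values and assembles the request path. The helper name `build_inference_path` and the identifier `bedrock-embeddings` are hypothetical. Percent-encoding the identifier is a defensive assumption, not a documented requirement.

```python
from urllib.parse import quote

# The two task types listed for this endpoint.
ALLOWED_TASK_TYPES = ("completion", "text_embedding")


def build_inference_path(task_type: str, inference_id: str) -> str:
    """Build /_inference/{task_type}/{amazonbedrock_inference_id}."""
    if task_type not in ALLOWED_TASK_TYPES:
        raise ValueError(
            f"task_type must be one of {ALLOWED_TASK_TYPES}, got {task_type!r}"
        )
    if not inference_id:
        raise ValueError("inference_id is required")
    # Encode characters that would otherwise change the path structure.
    return f"/_inference/{task_type}/{quote(inference_id, safe='')}"


if __name__ == "__main__":
    print(build_inference_path("text_embedding", "bedrock-embeddings"))
```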
Body
- chunking_settings (object)
- service (string, Required): Value is amazonbedrock.
- service_settings (object, Required)
- task_settings (object)
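The body fields listed above can be assembled with a small helper that always sets `service` to `amazonbedrock`, requires `service_settings`, and includes the two optional objects only when they are supplied. This is a minimal sketch: the function name `build_bedrock_body` is hypothetical, and the contents of `service_settings` are copied from this page's request example rather than validated.

```python
from __future__ import annotations

import json
from typing import Any


def build_bedrock_body(
    service_settings: dict[str, Any],
    chunking_settings: dict[str, Any] | None = None,
    task_settings: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Assemble the request body; optional objects are omitted when None."""
    if not service_settings:
        raise ValueError("service_settings is required")
    body: dict[str, Any] = {
        "service": "amazonbedrock",  # the only value accepted for this endpoint
        "service_settings": dict(service_settings),
    }
    if chunking_settings is not None:
        body["chunking_settings"] = dict(chunking_settings)
    if task_settings is not None:
        body["task_settings"] = dict(task_settings)
    return body


if __name__ == "__main__":
    body = build_bedrock_body(
        {
            "access_key": "AWS-access-key",
            "secret_key": "AWS-secret-key",
            "region": "us-east-1",
            "provider": "amazontitan",
            "model": "amazon.titan-embed-text-v2:0",
        }
    )
    print(json.dumps(body, indent=2))
```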
curl \
--request PUT 'http://api.example.com/_inference/{task_type}/{amazonbedrock_inference_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '{"service": "amazonbedrock", "service_settings": {"access_key": "AWS-access-key", "secret_key": "AWS-secret-key", "region": "us-east-1", "provider": "amazontitan", "model": "amazon.titan-embed-text-v2:0"}}'
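For readers who prefer a script to curl, the same PUT request can be prepared with Python's standard library. This is a sketch under stated assumptions: `make_put_request` and the endpoint name `my-endpoint` are hypothetical, the host is the placeholder from the curl example, and the `Authorization` value is passed through exactly as the caller supplies it. The request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict


def make_put_request(
    base_url: str,
    task_type: str,
    inference_id: str,
    auth_header: str,
    body: Dict[str, Any],
) -> urllib.request.Request:
    """Prepare (but do not send) the PUT request that creates the endpoint."""
    url = f"{base_url.rstrip('/')}/_inference/{task_type}/{inference_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # JSON object, not a quoted string
        headers={
            "Authorization": auth_header,
            "Content-Type": "application/json",
        },
        method="PUT",
    )


if __name__ == "__main__":
    req = make_put_request(
        "http://api.example.com",
        "text_embedding",
        "my-endpoint",
        "PLACEHOLDER-API-KEY",
        {
            "service": "amazonbedrock",
            "service_settings": {
                "access_key": "AWS-access-key",
                "secret_key": "AWS-secret-key",
                "region": "us-east-1",
                "provider": "amazontitan",
                "model": "amazon.titan-embed-text-v2:0",
            },
        },
    )
    print(req.get_method(), req.full_url)
```

To actually send it, pass the prepared object to `urllib.request.urlopen(req)` and read the JSON response.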
{
  "service": "amazonbedrock",
  "service_settings": {
    "access_key": "AWS-access-key",
    "secret_key": "AWS-secret-key",
    "region": "us-east-1",
    "provider": "amazontitan",
    "model": "amazon.titan-embed-text-v2:0"
  }
}

{
  "service": "openai",
  "service_settings": {
    "api_key": "OpenAI-API-Key",
    "model_id": "gpt-3.5-turbo"
  }
}