Google AI Studio inference service
Creates an inference endpoint to perform an inference task with the googleaistudio service.
Request
PUT /_inference/<task_type>/<inference_id>
Path parameters
<inference_id>
  (Required, string) The unique identifier of the inference endpoint.
<task_type>
  (Required, string) The type of the inference task that the model will perform.
  Available task types:
  - completion
  - text_embedding
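As a rough illustration of how the two path parameters combine into the request path, the following Python sketch validates the task type against the two values listed above and formats the path. The helper name inference_endpoint_path and the constant SUPPORTED_TASK_TYPES are hypothetical and are not part of any Elasticsearch client.

```python
# Task types listed above for the googleaistudio service.
SUPPORTED_TASK_TYPES = {"completion", "text_embedding"}


def inference_endpoint_path(task_type: str, inference_id: str) -> str:
    """Build the PUT path for an inference endpoint (hypothetical helper)."""
    if task_type not in SUPPORTED_TASK_TYPES:
        raise ValueError(
            f"unsupported task type {task_type!r}; "
            f"expected one of {sorted(SUPPORTED_TASK_TYPES)}"
        )
    if not inference_id:
        raise ValueError("inference_id must be a non-empty string")
    return f"/_inference/{task_type}/{inference_id}"


print(inference_endpoint_path("completion", "google_ai_studio_completion"))
# prints /_inference/completion/google_ai_studio_completion
```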
Request body
service
  (Required, string) The type of service supported for the specified task type. In this case, googleaistudio.
service_settings
  (Required, object) Settings used to install the inference model. These settings are specific to the googleaistudio service.
  api_key
    (Required, string) A valid API key for the Google Gemini API.
  model_id
    (Required, string) The name of the model to use for the inference task. You can find the supported models at Gemini API models.
  rate_limit
    (Optional, object) By default, the googleaistudio service sets the number of requests allowed per minute to 360. This helps to minimize the number of rate limit errors returned from Google AI Studio. To modify this, set the requests_per_minute setting of this object in your service settings:
    "rate_limit": { "requests_per_minute": <number_of_requests> }
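The request body described above can be assembled programmatically. The sketch below is one possible way to do it in Python, assuming you want rate_limit to be included only when you override the default; the function name build_inference_config and the value 120 are illustrative assumptions, not recommendations from the service.

```python
import json
from typing import Any, Dict, Optional


def build_inference_config(
    api_key: str,
    model_id: str,
    requests_per_minute: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a request body for the googleaistudio service (hypothetical helper)."""
    service_settings: Dict[str, Any] = {"api_key": api_key, "model_id": model_id}
    # rate_limit is optional; omit it to keep the service default.
    if requests_per_minute is not None:
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be a positive integer")
        service_settings["rate_limit"] = {"requests_per_minute": requests_per_minute}
    return {"service": "googleaistudio", "service_settings": service_settings}


# 120 is an arbitrary illustrative value.
config = build_inference_config("<api_key>", "<model_id>", requests_per_minute=120)
print(json.dumps(config, indent=2))
```

The resulting dictionary has the same shape as the request body shown in the example for this service, with the rate_limit object nested inside service_settings.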
Google AI Studio service example
The following example shows how to create an inference endpoint called google_ai_studio_completion to perform a completion task type.
resp = client.inference.put(
    task_type="completion",
    inference_id="google_ai_studio_completion",
    inference_config={
        "service": "googleaistudio",
        "service_settings": {
            "api_key": "<api_key>",
            "model_id": "<model_id>"
        }
    },
)
print(resp)

const response = await client.inference.put({
  task_type: "completion",
  inference_id: "google_ai_studio_completion",
  inference_config: {
    service: "googleaistudio",
    service_settings: {
      api_key: "<api_key>",
      model_id: "<model_id>",
    },
  },
});
console.log(response);

PUT _inference/completion/google_ai_studio_completion
{
  "service": "googleaistudio",
  "service_settings": {
    "api_key": "<api_key>",
    "model_id": "<model_id>"
  }
}
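The other available task type, text_embedding, can presumably be configured the same way. The following Python sketch mirrors the client call used in the example above; the function name create_embedding_endpoint, the endpoint name google_ai_studio_embeddings, and the connection URL in the usage comment are assumptions made for illustration, and the placeholders must be replaced with real values.

```python
from typing import Any


def create_embedding_endpoint(
    client: Any, inference_id: str, api_key: str, model_id: str
) -> Any:
    """Create a text_embedding endpoint with the googleaistudio service.

    `client` is expected to expose inference.put() with the same keyword
    arguments as the completion example above (hypothetical wrapper).
    """
    return client.inference.put(
        task_type="text_embedding",
        inference_id=inference_id,
        inference_config={
            "service": "googleaistudio",
            "service_settings": {
                "api_key": api_key,
                "model_id": model_id,
            },
        },
    )


# Usage, assuming a reachable cluster (the URL is a placeholder):
#   from elasticsearch import Elasticsearch
#   client = Elasticsearch("http://localhost:9200")
#   resp = create_embedding_endpoint(
#       client, "google_ai_studio_embeddings", "<api_key>", "<model_id>"
#   )
#   print(resp)
```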