Migrating to 8.0

The 8.0 client contains major changes that affect how you use it. The sections below outline the changes you’ll have to take into account when upgrading from 7.x to 8.0.

Enable compatibility mode and upgrade Elasticsearch

Upgrade your Elasticsearch client to 7.16:

$ python -m pip install --upgrade 'elasticsearch>=7.16,<8'

If you have an existing application, enable compatibility mode by setting the ELASTIC_CLIENT_APIVERSIONING=1 environment variable. This instructs the Elasticsearch server to accept and respond with 7.x-compatible requests and responses.
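
If the application is started from a Python entry point, one option is to set the variable in the process environment. The sketch below assumes it runs at the very top of the entry point, before the client is imported or instantiated; that ordering is a conservative assumption, not something this guide specifies.

```python
import os

# Assumption: this runs at the very top of the application's entry point,
# before the Elasticsearch client is imported or instantiated.
# setdefault keeps a value already exported by the shell or service manager.
os.environ.setdefault("ELASTIC_CLIENT_APIVERSIONING", "1")

compat_enabled = os.environ["ELASTIC_CLIENT_APIVERSIONING"] == "1"
print(f"compatibility mode requested: {compat_enabled}")
```

Exporting the variable in the shell or service configuration that launches the application achieves the same thing without touching the code.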

After you’ve done this you can upgrade the Elasticsearch server to 8.0.0.

Upgrading the client

After you’ve deployed your application with the 7.16 client against an 8.0.0 Elasticsearch server, you can upgrade your client to 8.0:

$ python -m pip install --upgrade 'elasticsearch>=8,<9'

Removing deprecation warnings

After upgrading the client to 8.0 you’ll likely notice your code either raising errors or emitting DeprecationWarning. Both signal places where your code needs to change to work with the 8.0 client.
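
Python's default warning filters hide most DeprecationWarning output unless it is triggered by code in `__main__`, so these warnings are easy to miss. The sketch below uses only the standard `warnings` module to collect them; `legacy_call` is a hypothetical stand-in for one of your own functions that still uses a 7.x pattern.

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for a client call that still uses a deprecated 7.x pattern.
    warnings.warn("positional arguments are deprecated", DeprecationWarning, stacklevel=2)
    return "ok"


def find_deprecations(func):
    """Run func and return the messages of any DeprecationWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" makes sure repeated warnings are not suppressed by the default filters.
        warnings.simplefilter("always", DeprecationWarning)
        func()
    return [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]


print(find_deprecations(legacy_call))
```

Running a test suite with `python -W error::DeprecationWarning` is another way to surface the same warnings, this time as exceptions.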

Strict client configuration

Previously the client would default to scheme="http", host="localhost", and port=9200 when specifying which node(s) to connect to. Starting in 8.0 these defaults have been removed. You must either configure scheme, host, and port explicitly or use cloud_id; this avoids confusion about which Elasticsearch instance is being connected to.

This choice was made because starting in 8.0.0 Elasticsearch enables HTTPS by default, so it’s no longer safe to assume that http://localhost:9200 is the locally running cluster.

See documentation on connecting to Elasticsearch and configuring HTTPS.

For quick examples, using a configuration like one of the two below works best:

from elasticsearch import Elasticsearch

# If you're connecting to an instance on Elastic Cloud:
client = Elasticsearch(
    cloud_id="cluster-1:dXMa5Fx...",

    # Include your authentication like 'api_key',
    # 'basic_auth', or 'bearer_auth' here:
    basic_auth=("elastic", "<password>")
)

# If you're connecting to an instance hosted elsewhere:
client = Elasticsearch(
    # Notice that the scheme (http://), host (localhost),
    # and port (9200) are explicit here:
    "http://localhost:9200",

    # Include your authentication like 'api_key',
    # 'basic_auth', or 'bearer_auth' here:
    api_key=("<name>", "<key>")
)
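
If node URLs come from application configuration, it can help to check them before they reach the client. The following is a minimal sketch using only the standard library; `require_explicit_node_url` is a hypothetical helper and does not reproduce the client's own validation.

```python
from urllib.parse import urlsplit


def require_explicit_node_url(url: str) -> str:
    """Return url unchanged if it spells out scheme, host, and port.

    Hypothetical helper for application config checks; not part of the client.
    Raises ValueError when any of the three parts is missing.
    """
    parts = urlsplit(url)
    missing = []
    if parts.scheme not in ("http", "https"):
        missing.append("scheme")
    if not parts.hostname:
        missing.append("host")
    if parts.port is None:
        missing.append("port")
    if missing:
        raise ValueError(f"{url!r} is missing an explicit {', '.join(missing)}")
    return url


print(require_explicit_node_url("http://localhost:9200"))
```

A value such as "localhost:9200" or "http://localhost" is rejected with a ValueError, which points at the incomplete setting before any connection is attempted.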

Keyword-only arguments for APIs

APIs used to support both positional and keyword arguments, although the documentation always recommended keyword arguments. Starting in 7.14, using positional arguments raised a DeprecationWarning but still worked.

Starting in 8.0, keyword-only arguments are required for APIs for better forwards-compatibility with new API options. Attempting to use positional arguments raises a TypeError.

# 8.0+ SUPPORTED USAGE:
client.indices.get(index="*")

# 7.x UNSUPPORTED USAGE (Don't do this!):
client.indices.get("*")
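
The behaviour is the same as for any Python function that declares keyword-only parameters. The sketch below illustrates the mechanism with a hypothetical stand-in class; it is not the client's real implementation, only the calling convention.

```python
class FakeIndicesClient:
    """Hypothetical stand-in, not the real client; it only mirrors the calling convention."""

    def get(self, *, index):
        # The bare '*' makes every following parameter keyword-only.
        return {"requested_index": index}


indices = FakeIndicesClient()
print(indices.get(index="*"))

try:
    indices.get("*")
except TypeError as err:
    # Positional use fails before the method body runs.
    print(f"TypeError: {err}")
```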

Start using .options() for transport parameters

Previously some per-request options like api_key and ignore were allowed within client API methods. Starting in 8.0 this is deprecated for all APIs, and for a small number of APIs it may break in unexpected ways if not changed.

The parameters headers, api_key, http_auth, opaque_id, request_timeout, and ignore are affected:

from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

# 8.0+ SUPPORTED USAGE:
client.options(api_key=("id", "api_key")).search(index="blogs")

# 7.x DEPRECATED USAGE (Don't do this!):
client.search(index="blogs", api_key=("id", "api_key"))

Some of these parameters have been renamed to be more readable and to fit other APIs. ignore should be ignore_status and http_auth should be basic_auth:

# 8.0+ SUPPORTED USAGES:
client.options(basic_auth=("username", "password")).search(...)
client.options(ignore_status=404).indices.delete(index=...)

# 7.x DEPRECATED USAGES (Don't do this!):
client.search(http_auth=("username", "password"), ...)
client.indices.delete(index=..., ignore=404)
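
When call sites build their keyword arguments dynamically, the renames can be applied mechanically. The sketch below is one possible approach; `split_transport_options` and `TRANSPORT_OPTION_NAMES` are hypothetical names, and the mapping only covers the parameters named in this section.

```python
# Per-request parameters named in this guide, mapped to their 8.0 '.options()' names.
TRANSPORT_OPTION_NAMES = {
    "headers": "headers",
    "api_key": "api_key",
    "http_auth": "basic_auth",
    "opaque_id": "opaque_id",
    "request_timeout": "request_timeout",
    "ignore": "ignore_status",
}


def split_transport_options(kwargs):
    """Split 7.x-style keyword arguments into (.options() kwargs, API kwargs).

    Hypothetical migration helper; it is not part of the client.
    """
    options, api_params = {}, {}
    for name, value in kwargs.items():
        if name in TRANSPORT_OPTION_NAMES:
            options[TRANSPORT_OPTION_NAMES[name]] = value
        else:
            api_params[name] = value
    return options, api_params


options, api_params = split_transport_options(
    {"index": "blogs", "ignore": 404, "http_auth": ("username", "password")}
)
print(options)
print(api_params)
```

The two dictionaries can then be passed as `client.options(**options).search(**api_params)`. The helper cannot tell a transport option from an API parameter that happens to share its name, so calls where the name is ambiguous still need to be migrated by hand.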

For the following APIs this change is breaking and has no deprecation period, because of conflicts between the client API and Elasticsearch’s API:

  • sql.query using request_timeout
  • security.grant_api_key using api_key
  • render_search_template using params
  • search_template using params

You should immediately evaluate the usage of these parameters and start using .options(...) to avoid unexpected behavior. Below is an example of migrating away from using per-request api_key with the security.grant_api_key API:

# 8.0+ SUPPORTED USAGES:
resp = (
    client.options(
        # This is the API key being used for the request
        api_key=("request-id", "request-api-key")
    ).security.grant_api_key(
        # This is the API key being granted
        api_key={
            "name": "granted-api-key"
        },
        grant_type="password",
        username="elastic",
        password="changeme"
    )
)

# 7.x DEPRECATED USAGES (Don't do this!):
resp = (
    # This is the API key being used for the request
    client.security.grant_api_key(
        api_key=("request-id", "request-api-key"),
        # This is the API key being granted
        body={
            "api_key": {
                "name": "granted-api-key"
            },
            "grant_type": "password",
            "username": "elastic",
            "password": "changeme"
        }
    )
)

Changes to API responses

In 7.x and earlier the return type of API methods was the raw deserialized response body. This meant that there was no way to access HTTP status codes, headers, or other information from the transport layer.

In 8.0.0 responses are no longer the raw deserialized response body. They are instead an object with two properties, meta and body. Transport layer metadata about the response, such as HTTP status, headers, version, and which node serviced the request, is available on meta:

>>> resp = client.search(...)

# Response is no longer a 'dict'
>>> resp
ObjectApiResponse({'took': 1, 'timed_out': False, ...})

# But can still be used like one:
>>> resp["hits"]["total"]
{'value': 5500, 'relation': 'eq'}

>>> resp.keys()
dict_keys(['took', 'timed_out', '_shards', 'hits'])

# HTTP status
>>> resp.meta.status
200

# HTTP headers
>>> resp.meta.headers['content-type']
'application/json'

# HTTP version
>>> resp.meta.http_version
'1.1'

Because the response is no longer a dictionary, list, str, or bytes instance, calling isinstance() on the response object with one of those types will return False. If you need direct access to the underlying deserialized response body you can use the body property:

>>> resp.body
{'took': 1, 'timed_out': False, ...}

# The response isn't a dict, but resp.body is.
>>> isinstance(resp, dict)
False

>>> isinstance(resp.body, dict)
True

Responses to requests that use the HEAD HTTP method can still be used within if conditions, but identity comparisons with is no longer work:

>>> resp = client.indices.exists(index=...)
>>> resp.body
True

>>> resp is True
False

>>> resp.body is True
True

>>> isinstance(resp, bool)
False

>>> isinstance(resp.body, bool)
True
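
Code that has to run against both client versions during the migration can normalize the result before inspecting its type. This is a minimal sketch; `unwrap_body` is a hypothetical helper that relies only on the body property described above, and `FakeResponse` is a stand-in so the example runs without a cluster.

```python
def unwrap_body(resp):
    """Return the deserialized body for 7.x (raw value) and 8.0 (response object) results.

    Hypothetical compatibility helper, duck-typed on the 'body' property.
    """
    return getattr(resp, "body", resp)


class FakeResponse:
    # Stand-in for an 8.0 response object so the sketch runs without a cluster.
    def __init__(self, body):
        self.body = body


# 7.x style: the return value is already the raw body.
print(unwrap_body({"took": 1}))

# 8.0 style: the raw body lives on the 'body' property, so 'is True' works again.
print(unwrap_body(FakeResponse(True)) is True)
```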

Changes to error classes

Previously elasticsearch.TransportError was the base class for both transport layer errors (like timeouts, connection errors) and API layer errors (like "404 Not Found" when accessing an index). This was pretty confusing when you wanted to capture API errors to inspect them for a response body and not capture errors from the transport layer.

Now in 8.0 elasticsearch.TransportError is a redefinition of elastic_transport.TransportError and will only be the base class for true transport layer errors. If you instead want to capture API layer errors you can use the new elasticsearch.ApiError base class.

from elasticsearch import TransportError, Elasticsearch

client = Elasticsearch("http://localhost:9200")

try:
    client.indices.get(index="index-that-does-not-exist")

# In elasticsearch-python v7.x this would capture the resulting
# 'NotFoundError' that would be raised above. But in 8.0.0 this
# 'except TransportError' won't capture 'NotFoundError'.
except TransportError as err:
    print(f"TransportError: {err}")

The elasticsearch.ElasticsearchException base class has been removed as well. If you’d like to capture all errors that can be raised from the library you can capture both elasticsearch.ApiError and elasticsearch.TransportError:

from elasticsearch import TransportError, ApiError, Elasticsearch

client = Elasticsearch("http://localhost:9200")

try:
    client.search(...)
# This is the 'except' clause you should use if you *actually* want to
# capture both Transport errors and API errors in one clause:
except (ApiError, TransportError) as err:
    ...

# However, I recommend splitting the errors into their own 'except'
# clauses so you can handle TransportErrors differently. This
# construction wasn't possible in 7.x and earlier.
try:
    client.search(...)
except ApiError as err:
    ... # API errors handled here
except TransportError as err:
    ... # Transport errors handled here

elasticsearch.helpers.errors.BulkIndexError and elasticsearch.helpers.errors.ScanError now use Exception as a base class instead of ElasticsearchException.

Another difference between 7.x and 8.0 errors is their properties. Previously there were status_code, info, and error properties that weren’t very useful, as they’d hold a mix of different value types depending on what the error was and which layer raised it (transport versus API). In 8.0 you can inspect an ApiError instance through two properties: meta for the response metadata and body for the response body:

from elasticsearch import ApiError, Elasticsearch

client = Elasticsearch("http://localhost:9200")

try:
    client.indices.get(index="index-that-does-not-exist")
except ApiError as err:
    print(err.meta.status)
    # 404
    print(err.meta.headers)
    # {'content-length': '200', ...}
    print(err.body)
    # {
    #   'error': {
    #     'type': 'index_not_found_exception',
    #     'reason': 'no such index',
    #     'resource.type': 'index_or_alias',
    #     ...
    #   },
    #   'status': 404
    # }
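
For logging, the fields shown above can be reduced to a short summary. The sketch below is duck-typed on the meta and body properties; `summarize_api_error` is a hypothetical helper, and the stand-in object mirrors the example body above so the code runs without a cluster.

```python
from types import SimpleNamespace


def summarize_api_error(err):
    """Return (status, error_type, reason) from an ApiError-like object.

    Hypothetical helper, duck-typed on the 'meta' and 'body' properties.
    """
    body = err.body if isinstance(err.body, dict) else {}
    error = body.get("error")
    if not isinstance(error, dict):
        # Assumption: some error bodies may not carry a structured 'error' object.
        error = {}
    return err.meta.status, error.get("type"), error.get("reason")


# Stand-in for a caught error so the sketch runs without a cluster.
fake_err = SimpleNamespace(
    meta=SimpleNamespace(status=404),
    body={
        "error": {"type": "index_not_found_exception", "reason": "no such index"},
        "status": 404,
    },
)
print(summarize_api_error(fake_err))
```

In real code the helper would be called inside an `except ApiError as err:` clause with the caught exception.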