Bahubali Shetti

Tracing Langchain apps with Elastic, OpenLLMetry, and OpenTelemetry

LangChain applications are growing in use. Building RAG-based applications, simple AI assistants, and more is becoming the norm, but observing these applications is still hard. Given the various options that are out there, this blog shows how to use OpenTelemetry instrumentation with OpenLLMetry and ingest the traces into Elastic Observability APM.

LangChain has rapidly emerged as a crucial framework in the AI development landscape, particularly for building applications powered by large language models (LLMs). As its adoption has soared among developers, the need for effective debugging and performance optimization tools has become increasingly apparent. One such essential tool is the ability to obtain and analyze traces from LangChain applications. Tracing provides invaluable insights into the execution flow, helping developers understand and improve their AI-driven systems. 

There are several options for tracing LangChain applications. One is LangSmith, which is ideal for detailed tracing and a complete breakdown of requests to large language models (LLMs); however, it is specific to LangChain. OpenTelemetry (OTel), meanwhile, is broadly accepted as the industry standard for tracing. As one of the major Cloud Native Computing Foundation (CNCF) projects, with as many commits as Kubernetes, it is gaining support from major ISVs and cloud providers that are delivering support for the framework.

Moreover, many LangChain-based applications will have multiple components beyond just LLM interactions, which makes using OpenTelemetry with LangChain essential. OpenLLMetry is an available option for tracing LangChain apps in addition to LangSmith.

This blog will show how you can get LangChain tracing into Elastic using the OpenLLMetry library opentelemetry-instrumentation-langchain.

Pre-requisites:

  • An Elastic Cloud account with Elastic Observability APM (a free trial works)

  • An Azure OpenAI deployment (endpoint and API key)

  • A Tavily Search API key

  • Python 3.x

Overview

To highlight tracing, I created a simple LangChain app that does the following (a rough sketch of such an app appears after the list):

  1. Takes user input (queries) on the command line.

  2. Sends the query to the Azure OpenAI LLM via a LangChain chain.

  3. The chain's tools are configured to use Tavily search.

  4. The LLM uses the search output and returns the relevant information to the user.
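
Below is a rough sketch of what such an app can look like. It is not the exact code from this blog; it assumes the langchain-openai and langchain-community packages, an Azure OpenAI deployment configured through the usual AZURE_OPENAI_* environment variables, and a TAVILY_API_KEY. Executing the tool calls that the model requests and feeding the results back to the model is omitted for brevity.

# Rough sketch only -- assumes langchain-openai and langchain-community are
# installed and that AZURE_OPENAI_* / TAVILY_API_KEY environment variables are set
from langchain_openai import AzureChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import ChatPromptTemplate

# Azure OpenAI chat model (deployment name and API version are illustrative)
llm = AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-02-01")

# Tavily search tool the model can call for up-to-date results
search = TavilySearchResults(max_results=3)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use the search tool when needed."),
    ("human", "{query}"),
])

# Bind the Tavily tool so the LLM can request searches; running those tool
# calls and passing the results back to the model is elided in this sketch
chain = prompt | llm.bind_tools([search])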

As you can see, Elastic Observability's APM recognizes the LangChain app and also shows the full trace (done with manual instrumentation):

As the above image shows:

  1. The user makes a query
  2. Azure OpenAI is called, and it uses a tool (Tavily) to obtain some results
  3. Azure OpenAI reviews the results and returns a summary to the end user

The code was manually instrumented, but auto-instrumentation can also be used.

OpenTelemetry Configuration

To use OpenTelemetry, we need to configure the SDK to generate traces and to set Elastic's endpoint and authorization. Instructions can be found in the OpenTelemetry auto-instrumentation setup documentation.

OpenTelemetry Environment variables:

The OpenTelemetry environment variables for Elastic can be set as follows in Linux (or in the code):

OTEL_EXPORTER_OTLP_ENDPOINT=12345.apm.us-west-2.aws.cloud.es.io:443
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer%20ZZZZZZZ"
OTEL_RESOURCE_ATTRIBUTES="service.name=langchainChat,service.version=1.0,deployment.environment=production"

As you can see, OTEL_EXPORTER_OTLP_ENDPOINT is set to Elastic's APM server, and the corresponding authorization header is also provided. These can be easily obtained from Elastic's APM configuration screen under OpenTelemetry.

Note: No agent is needed; we simply send the OTLP trace messages directly to Elastic's APM server.
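
As mentioned above, the same values can also be set from the code instead of the shell. A minimal sketch using the standard os.environ approach (these must be set before the OpenTelemetry SDK or auto-instrumentation starts):

import os

# Illustrative values only -- use your own Elastic APM endpoint and secret token
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "12345.apm.us-west-2.aws.cloud.es.io:443"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = "Authorization=Bearer%20ZZZZZZZ"
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = (
    "service.name=langchainChat,service.version=1.0,deployment.environment=production"
)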

OpenLLMetry Library:

OpenTelemetry's auto-instrumentation can be extended to trace other frameworks via instrumentation packages.

First, you must install the following package: 

pip install opentelemetry-instrumentation-langchain

This library was developed by OpenLLMetry. 

Then you will need to add the following to the code.

from opentelemetry.instrumentation.langchain import LangchainInstrumentor
LangchainInstrumentor().instrument()

Instrumentation

Once the libraries are added and the environment variables are set, you can run the app with auto-instrumentation as follows:

opentelemetry-instrument python tavilyAzureApp.py
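
If the opentelemetry-instrument command is not already available on your system, installing the opentelemetry-distro package (along with the OTLP exporter) provides it:

pip install opentelemetry-distro opentelemetry-exporter-otlp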

The OpenLLMetry library pulls out the flow correctly with no manual instrumentation needed beyond adding the instrumentor shown above:

  1. Takes user input (queries) on the command line.

  2. Sends the query to the Azure OpenAI LLM via a LangChain chain.

  3. The chain's tools are configured to use Tavily search.

  4. The LLM uses the search output and returns the relevant information to the user.

Manual instrumentation

If you want to get more details out of the application, you will need to instrument manually. To get more traces, follow my Python instrumentation guide, which walks you through setting up the necessary OpenTelemetry pieces. Additionally, you can look at the OTel documentation for instrumenting in Python.

Note that the environment variables OTEL_EXPORTER_OTLP_HEADERS and OTEL_EXPORTER_OTLP_ENDPOINT are set as noted in the section above. You can also set OTEL_RESOURCE_ATTRIBUTES.
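
For reference, here is a minimal sketch of that SDK initialization, assuming the opentelemetry-sdk and opentelemetry-exporter-otlp packages are installed. The OTLP exporter picks up the endpoint and headers from the environment variables above, and the default resource picks up OTEL_RESOURCE_ATTRIBUTES:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# The exporter reads OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS;
# the default Resource applies OTEL_RESOURCE_ATTRIBUTES (service.name, etc.)
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)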

Once you follow the steps in either guide and initialize the tracer, you essentially just add a span where you want more detail. In the example below, only one line of code is added for span initialization.

Look at the placement of with tracer.start_as_current_span("getting user query") as span: in the code below.
import asyncio

from opentelemetry import trace

# Creates a tracer from the global tracer provider
tracer = trace.get_tracer("newsQuery")

# `chain` is the LangChain chain built earlier in the app (elided here)

async def chat_interface():
    print("Welcome to the AI Chat Interface!")
    print("Type 'quit' to exit the chat.")

    # One custom span wrapping the interactive loop
    with tracer.start_as_current_span("getting user query") as span:
        while True:
            user_input = input("\nYou: ").strip()

            if user_input.lower() == 'quit':
                print("Thank you for chatting. Goodbye!")
                break

            print("AI: Thinking...")
            try:
                # Invoke the LangChain chain asynchronously with the user's query
                result = await chain.ainvoke({"query": user_input})
                print(f"AI: {result.content}")
            except Exception as e:
                print(f"An error occurred: {str(e)}")


if __name__ == "__main__":
    asyncio.run(chat_interface())

As you can see, with manual instrumentation, we get the following trace:

This trace calls out when we enter our query function, async def chat_interface().
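
If you want even more context on that span, the standard OpenTelemetry span API also lets you attach attributes and record exceptions. Below is a small, hedged variation of the loop body; the attribute name user.query is purely illustrative, not an Elastic or LangChain convention:

from opentelemetry.trace import Status, StatusCode

with tracer.start_as_current_span("getting user query") as span:
    user_input = input("\nYou: ").strip()
    # Illustrative attribute name -- pick names that fit your own conventions
    span.set_attribute("user.query", user_input)
    try:
        result = await chain.ainvoke({"query": user_input})
        print(f"AI: {result.content}")
    except Exception as e:
        # Record the exception as a span event and mark the span as errored
        span.record_exception(e)
        span.set_status(Status(StatusCode.ERROR))
        print(f"An error occurred: {str(e)}")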

Conclusion

In this blog, we discussed the following:

  • How to manually instrument Langchain with OpenTelemetry

  • How to properly initialize OpenTelemetry and add a custom span

  • How to easily set the OTLP ENDPOINT and OTLP HEADERS with Elastic without the need for a collector

  • How to see traces in Elastic Observability APM

Hopefully, this provides an easy-to-understand walk-through of instrumenting Langchain with OpenTelemetry and how easy it is to send traces into Elastic.

Additional resources for OpenTelemetry with Elastic:

Also log into cloud.elastic.co to try out Elastic with a free trial.
