Generation Phase

The prompt for the LLM is now ready, so all that remains is to send it to the model and stream back the response. To connect to LLMs, the application uses LangChain's streaming support, which fits nicely with the event streaming used in this application:

answer = ''
for chunk in get_llm().stream(qa_prompt):
    # Forward each token to the client as a server-sent event,
    # while accumulating the complete answer for later use.
    yield f'data: {chunk.content}\n\n'
    answer += chunk.content
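
For context, here is a minimal sketch of how a generator like this can be served over server-sent events with Flask. The /api/chat route, the import path, and the placeholder prompt are assumptions made for illustration; the real application assembles the prompt in the earlier steps:

from flask import Flask, Response, stream_with_context

from llm_integrations import get_llm  # assumed import path; the function lives in api/llm_integrations.py

app = Flask(__name__)

@app.post('/api/chat')  # hypothetical route name for this sketch
def chat():
    qa_prompt = '...'  # stands in for the fully assembled prompt from the previous steps

    def generate():
        answer = ''
        for chunk in get_llm().stream(qa_prompt):
            # Each SSE message is a 'data:' line terminated by a blank line.
            yield f'data: {chunk.content}\n\n'
            answer += chunk.content

    # The text/event-stream MIME type tells the client to read the body as SSE.
    return Response(stream_with_context(generate()), mimetype='text/event-stream')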

The get_llm() function is defined in api/llm_integrations.py. Its purpose is to return the correct LLM integration from LangChain according to the configuration. Assuming you configured OpenAI, the returned LLM will be an instance of LangChain's ChatOpenAI class.
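
To illustrate the idea, a configuration-driven get_llm() could look roughly like the following. The LLM_TYPE environment variable and the model name are assumptions for this sketch, not necessarily what the application ships with:

import os

from langchain_openai import ChatOpenAI

def get_llm():
    # Hypothetical configuration switch; the real api/llm_integrations.py
    # may read its settings differently.
    llm_type = os.getenv('LLM_TYPE', 'openai')
    if llm_type == 'openai':
        # ChatOpenAI reads OPENAI_API_KEY from the environment;
        # streaming=True makes the client stream tokens from the OpenAI API.
        return ChatOpenAI(model='gpt-4o-mini', streaming=True)
    raise ValueError(f'Unsupported LLM_TYPE: {llm_type}')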

Try it yourself