When you enter a question in the front end, a POST request is sent to the /api/chat endpoint. The body of the request must include the question from the user, in the following format:
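The request body listing is not reproduced in this text. As a minimal sketch, assuming the body is a JSON object with a single `question` field and that the application is served at a hypothetical `http://localhost:4000`, a client could prepare the request as follows. The helper name `build_chat_request` is illustrative and not part of the application.

```python
import json
import urllib.request

# Assumption: the application is reachable at this base URL (hypothetical).
BASE_URL = "http://localhost:4000"


def build_chat_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the /api/chat endpoint."""
    # Assumption: the body is a JSON object with a "question" field.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("What is the vacation policy?")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen()` would send it; the sketch stops short of that so it can be run without a server.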
The response from the application is an event stream, as defined in the Server-Sent Events (SSE) specification. The events that the server returns to the client have the following sequence:
data: [SESSION_ID] session-id-assigned-to-this-chat-session
data: [SOURCE] json-formatted-document
(repeated for each relevant document source that was identified)
data: response chunk
(repeated for each response chunk returned by the LLM)
data: [DONE]
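The event sequence above can be consumed on the client side with a small parser. The following is a sketch rather than the front end's actual code: it assumes each event arrives as a single `data: ` line, that each `[SOURCE]` payload is a JSON object, and that response chunks contain no embedded newlines. The names `ChatResult` and `parse_chat_stream` are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Iterable, List, Optional

DATA_PREFIX = "data: "
SESSION_TAG = "[SESSION_ID] "
SOURCE_TAG = "[SOURCE] "
DONE_TAG = "[DONE]"


@dataclass
class ChatResult:
    session_id: Optional[str] = None
    sources: List[dict] = field(default_factory=list)
    answer: str = ""
    done: bool = False


def parse_chat_stream(lines: Iterable[str]) -> ChatResult:
    """Collect the session ID, sources, and answer text from streamed lines."""
    result = ChatResult()
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith(DATA_PREFIX):
            continue  # skip blank separator lines and any other SSE fields
        payload = line[len(DATA_PREFIX):]
        if payload.startswith(SESSION_TAG):
            result.session_id = payload[len(SESSION_TAG):]
        elif payload.startswith(SOURCE_TAG):
            result.sources.append(json.loads(payload[len(SOURCE_TAG):]))
        elif payload == DONE_TAG:
            result.done = True
            break
        else:
            # Anything else is treated as a chunk of the LLM response.
            result.answer += payload
    return result


sample = [
    "data: [SESSION_ID] abc123\n", "\n",
    'data: [SOURCE] {"name": "Policy"}\n', "\n",
    "data: Hello\n", "\n",
    "data:  world\n", "\n",
    "data: [DONE]\n", "\n",
]
result = parse_chat_stream(sample)
print(result.session_id, result.sources, result.answer)
```

Only the first space after `data:` is stripped, so leading whitespace inside a response chunk is preserved when the chunks are concatenated.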
To ask a follow-up question, the client adds a session_id query string argument to the request URL, set to the session ID it received in the [SESSION_ID] event.
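As an illustration of the follow-up mechanism, the sketch below appends a `session_id` argument to a request URL using only the standard library. The helper name `with_session_id` and the example base URL are assumptions made for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_session_id(url: str, session_id: str) -> str:
    """Return the URL with its session_id query argument set to session_id."""
    parts = urlsplit(url)
    # Keep any other query arguments, but drop a stale session_id if present.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "session_id"
    ]
    query.append(("session_id", session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical base URL; the session ID would come from the [SESSION_ID] event.
print(with_session_id("http://localhost:4000/api/chat", "abc123"))
```

Using `urlencode` rather than string concatenation ensures that a session ID containing reserved characters is escaped correctly.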
The high-level logic for the chatbot endpoint is in the api_chat() function of the Flask application, in file api/app.py:
The ask_question() function in file api/chat.py is a generator function that streams the events described above using Flask's response streaming feature, which is based on the yield keyword:
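The application's own listings are not reproduced in this text. The following is a simplified sketch of the streaming pattern, not the contents of api/app.py or api/chat.py: a generator yields one SSE-formatted string per event, and Flask streams the generator when it is passed to `Response`. The names `stream_events`, `create_app`, and `api_chat_sketch`, the `question` body field, and the placeholder sources and chunks are all assumptions; a real implementation would obtain them from the retrieval step and the LLM.

```python
import json
from typing import Iterable, Iterator


def stream_events(
    session_id: str, sources: Iterable[dict], chunks: Iterable[str]
) -> Iterator[str]:
    """Yield the events in the documented order, one SSE message at a time."""
    yield f"data: [SESSION_ID] {session_id}\n\n"
    for doc in sources:
        yield f"data: [SOURCE] {json.dumps(doc)}\n\n"
    for chunk in chunks:
        # Assumption: chunks contain no newlines; SSE would need one
        # "data:" line per line of text otherwise.
        yield f"data: {chunk}\n\n"
    yield "data: [DONE]\n\n"


def create_app():
    # Imported here so the generator above can be tried without Flask installed.
    from flask import Flask, Response, request

    app = Flask(__name__)

    @app.route("/api/chat", methods=["POST"])
    def api_chat_sketch():
        # Read everything needed from the request before streaming starts.
        question = request.get_json()["question"]  # assumed body field
        session_id = request.args.get("session_id", "new-session")  # placeholder ID
        sources = [{"name": "example document"}]  # placeholder retrieval result
        chunks = ["You asked: ", question]  # placeholder for LLM output
        return Response(
            stream_events(session_id, sources, chunks),
            mimetype="text/event-stream",
        )

    return app


for event in stream_events("s1", [{"a": 1}], ["Hi", " there"]):
    print(repr(event))
```

Because the response body is a generator, Flask sends each yielded string to the client as it is produced instead of buffering the whole answer, which is what lets the front end display the response incrementally.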