chore: bumped langchain-community from 0.2.7 to 0.2.9 and added langchain rag description #150

Merged · 1 commit · Aug 9, 2024
62 changes: 62 additions & 0 deletions examples/langchain_rag/README.md
@@ -0,0 +1,62 @@
## Overview

An example of a simple DIAL RAG application based on Langchain, using the Chroma vector database and the RetrievalQA chain.

The application processes a chat completion request in the following way:

1. finds the last attachment in the conversation history and extracts the URL from it,
2. downloads the document from the URL,
3. parses the document if it is a PDF, or treats it as plain text otherwise,
4. splits the text of the document into chunks,
5. computes the embeddings for the chunks,
6. saves the embeddings in the local cache,
7. runs the RetrievalQA Langchain chain, which consults the embeddings store and calls the chat completion model to generate the final answer.
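The first step above — locating the last attachment in the history and extracting its URL — could be sketched roughly as follows. This is a minimal illustration assuming the request shape shown in the curl example in the Usage section; `find_last_attachment_url` is a hypothetical helper name, not part of the application:

```python
from typing import Optional

# Walk the conversation history backwards and return the URL of the
# most recent attachment, if any. Hypothetical helper for illustration.
def find_last_attachment_url(messages: list[dict]) -> Optional[str]:
    for message in reversed(messages):
        attachments = message.get("custom_content", {}).get("attachments", [])
        if attachments:
            return attachments[-1].get("url")
    return None

messages = [
    {
        "role": "user",
        "content": "Who is Miss Meyers?",
        "custom_content": {
            "attachments": [{"url": "https://en.wikipedia.org/wiki/Miss_Meyers"}]
        },
    }
]
print(find_last_attachment_url(messages))
```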

Upon start, the Docker image exposes the `openai/deployments/simple-rag/chat/completions` endpoint on port `5000`.

## Configuration

|Variable|Default|Description|
|---|---|---|
|DIAL_URL||Required. URL of the DIAL server. Used to access embeddings and chat completion models|
|EMBEDDINGS_MODEL|text-embedding-ada-002|Embeddings model|
|CHAT_MODEL|gpt-4|Chat completion model|
|API_VERSION|2024-02-01|Azure OpenAI API version|
|LANGCHAIN_DEBUG|False|Flag to enable debug logs from Langchain|
|OPENAI_LOG||Flag that controls openai library logging. Set to `debug` to enable debug logging|

## Usage

The application can be tested by running it directly on your machine:

```sh
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m app
```

Then you may call the application using a DIAL API key:

```sh
curl "http://localhost:5000/openai/deployments/simple-rag/chat/completions" \
-X POST \
  -H "Content-Type: application/json" \
  -H "api-key: ${DIAL_API_KEY}" \
-d '{
"stream": true,
"messages": [
{
"role": "user",
"content": "Who is Miss Meyers?",
"custom_content": {
"attachments": [
{
"url": "https://en.wikipedia.org/wiki/Miss_Meyers"
}
]
}
}
]
}'
```
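The same request body could be assembled programmatically. The sketch below only builds the JSON payload from the curl example above; `build_rag_request` is a made-up helper name for illustration, and sending the request is left to whatever HTTP client you prefer:

```python
import json

# Build the chat completion payload shown in the curl example above.
# build_rag_request is a hypothetical helper, not part of the application.
def build_rag_request(question: str, attachment_url: str) -> str:
    payload = {
        "stream": True,
        "messages": [
            {
                "role": "user",
                "content": question,
                "custom_content": {
                    "attachments": [{"url": attachment_url}]
                },
            }
        ],
    }
    return json.dumps(payload)

body = build_rag_request(
    "Who is Miss Meyers?", "https://en.wikipedia.org/wiki/Miss_Meyers"
)
print(body)
```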
2 changes: 1 addition & 1 deletion examples/langchain_rag/app.py
@@ -42,7 +42,7 @@ def get_env(name: str) -> str:
chunk_size=256, chunk_overlap=0
)

embedding_store = LocalFileStore("./cache/")
embedding_store = LocalFileStore("./~cache/")


class CustomCallbackHandler(AsyncCallbackHandler):
2 changes: 1 addition & 1 deletion examples/langchain_rag/requirements.txt
@@ -1,6 +1,6 @@
aidial-sdk>=0.10
langchain==0.2.9
langchain-community==0.2.7
langchain-community==0.2.9
langchain-openai==0.1.17
langchain-text-splitters==0.2.2
tiktoken==0.7.0
1 change: 1 addition & 0 deletions examples/render_text/README.md
@@ -11,6 +11,7 @@ Upon start the Docker image exposes `openai/deployments/render-text/chat/complet
## Configuration

The application returns the image in one of the following formats:

1. Base64 encoded image
2. URL to the image stored in the DIAL file storage. `DIAL_URL` environment variable should be set to support image uploading to the storage.
