Tip

To discuss, get support, or give feedback, join Discord in the #argilla-distilabel-general and #argilla-distilabel-help channels. There you will be able to engage with our amazing community and the core developers of argilla and distilabel.
This integration allows you to include the feedback loop that Argilla offers in the LlamaIndex ecosystem. It is based on a callback handler that runs within the LlamaIndex workflow. Don't hesitate to check out both the LlamaIndex and Argilla documentation for more details.
You first need to install argilla-llama-index as follows:
pip install argilla-llama-index
If you have already deployed Argilla, you can skip this step. Otherwise, you can quickly deploy Argilla following this guide.
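If you need a local instance, one quick option is the Argilla quickstart Docker image (a sketch of the deployment, assuming Docker is available; see the deployment guide for alternatives and current image tags):

```bash
# Run the all-in-one quickstart image; the UI and API
# become available on port 6900, matching the api_url used below.
docker run -d --name argilla -p 6900:6900 argilla/argilla-quickstart:latest
```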
To log your data into Argilla within your LlamaIndex workflow, you only need a single step: call the Argilla global handler for LlamaIndex before running your LLM. The handler takes the following arguments:
- `dataset_name`: The name of the dataset. If the dataset does not exist, it will be created with the specified name; otherwise, it will be updated.
- `api_url`: The URL to connect to the Argilla instance.
- `api_key`: The API key to authenticate with the Argilla instance.
- `number_of_retrievals`: The number of retrieved documents to log. Defaults to 0.
- `workspace_name`: The name of the workspace to log the data to. Defaults to the first available workspace.
For more information about the credentials, check the documentation for users and workspaces.
from llama_index.core import set_global_handler

set_global_handler(
    "argilla",
    dataset_name="query_model",
    api_url="http://localhost:6900",
    api_key="argilla.apikey",
    number_of_retrievals=2,
)
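If you prefer to keep credentials out of source code, the same arguments can be assembled from environment variables before calling `set_global_handler`. A minimal sketch, where the helper name is a convention of this example (not part of the integration) and the fallbacks are the local quickstart defaults used in this guide:

```python
import os

# Hypothetical helper: gather the handler arguments described above,
# falling back to the local quickstart defaults when the
# ARGILLA_API_URL / ARGILLA_API_KEY environment variables are unset.
def argilla_handler_kwargs(dataset_name: str = "query_model") -> dict:
    return {
        "dataset_name": dataset_name,
        "api_url": os.getenv("ARGILLA_API_URL", "http://localhost:6900"),
        "api_key": os.getenv("ARGILLA_API_KEY", "argilla.apikey"),
        "number_of_retrievals": 2,
    }

kwargs = argilla_handler_kwargs()
```

You could then call `set_global_handler("argilla", **argilla_handler_kwargs())` instead of hard-coding the values.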
Let's log some data into Argilla. With the code below, you can create a basic LlamaIndex workflow. We will use GPT-3.5 from OpenAI as our LLM (you will need an OpenAI API key). Moreover, we will use an example .txt file obtained from the LlamaIndex documentation.
import os

from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

# LLM settings
Settings.llm = OpenAI(
    model="gpt-3.5-turbo", temperature=0.8, api_key=os.getenv("OPENAI_API_KEY")
)

# Load the data and create the index
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Create the query engine
query_engine = index.as_query_engine()
Now, let's run the query_engine to get a response from the model. The generated response will be logged into Argilla.
response = query_engine.query("What did the author do growing up?")
response
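Since `number_of_retrievals=2` was set above, the handler logs the retrieved contexts alongside the generated answer. As a continuation of the workflow above (it reuses `response`, so it needs the previous cells and a valid OpenAI key), you can inspect locally what will be logged:

```python
# The LlamaIndex Response object exposes the generated text and the
# retrieved source nodes with their similarity scores.
print(response.response)

for source in response.source_nodes:
    # Each source node pairs a retrieved chunk with its score.
    print(source.score, source.node.get_text()[:80])
```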