
How to work with LangChain and Oracle Generative AI


In this demo, we will learn how to work with LangChain's open-source building blocks, components, and LLM integrations; Streamlit, an open-source Python framework for data scientists and AI/ML engineers; and Oracle Generative AI to build the next generation of intelligent applications.

We will lay the groundwork by setting up the OCI command line interface, creating a simple LangChain app, writing an interactive healthcare application, and creating a search application that can scan through a training PDF document.

What is LangChain?

LangChain is a framework for developing applications powered by large language models (LLMs).

LangChain simplifies every stage of the LLM application lifecycle:

  • Development: Build your applications using LangChain's open-source building blocks and components. Hit the ground running using third-party integrations and Templates.
  • Productionization: Use LangSmith to inspect, monitor and evaluate your chains, so that you can continuously optimize and deploy with confidence.

With LangChain's flexible framework, we can build context-aware reasoning applications that leverage your company's data and APIs. Future-proof your application by making vendor optionality part of your LLM infrastructure design.

What is Streamlit?

Streamlit is a faster way to build and share data apps: it turns data scripts into shareable web apps in minutes, all in pure Python, with no front-end experience required.
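As a quick, hypothetical illustration of how little code a Streamlit app needs (the file name and text below are illustrative and not part of this repository):

# hello.py -- a minimal, hypothetical Streamlit app (illustrative only)
import streamlit as st

st.title("Hello, Streamlit")
name = st.text_input("Your name:", "World")  # a simple input widget
st.write(f"Hello, {name}!")                  # rendered live in the browser

Run it with streamlit run hello.py and the app opens in your browser.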

How does LangChain work with Oracle Generative AI?

This is a 3-step process.

  1. Set up the OCI command line interface (CLI) on your laptop, or use a cloud compute instance.
  2. Install the required Python libraries.
  3. Write and run the Python code.

Assumptions:

  1. You already have an Oracle Cloud account and access to Oracle Generative AI in the Chicago region.
  2. You have a default compartment created.
  3. You have administrative rights in the tenancy, or your administrator has set up the required policies to access Generative AI services.
  4. You have downloaded the Python code and Budget PDF used in this article.

Step 1: Install and configure the OCI CLI

Get User's OCID

  1. After logging into the cloud console, click User Settings under the top-right navigation; this takes you to the User Details page. Copy the OCID into a text file; we will need it later.

  2. On the same page, click the Add API Key button.

Generate and Download RSA Key Pair

  1. Choose the option to Generate a new key pair; if you already have keys generated, you can upload them here. The most important thing is to download both keys when you create a new key pair. Click the Add button.

    You should now be able to see the configuration file snippet. Copy and paste it into a file; we will need it later.

    Click the Close button.

  2. We can now see our newly created fingerprint.

Get Compartment OCID

Under Identity > Compartments, note your compartment OCID; we will need this OCID later.

Install OCI Command Line Interface (CLI)

  1. Install the OCI CLI on macOS.

  2. If you have not installed Homebrew on macOS, please refer to the official guide, Install Brew on Mac:

     /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    
     brew update && brew install oci-cli
    

    For other operating systems and more details on the OCI CLI, please check this link.

Update OCI Configuration file

Update the DEFAULT profile values to match your OCI environment:

$ vi ~/.oci/config

The parameters will look something like this:

[DEFAULT]
user=ocid1.user.oc1..aaaaaaXXXX63smy6knndy5q
fingerprint=d5:84:a7:0e:XXXX:ed:11:1a:50
tenancy=ocid1.tenancy.oc1..aaaaaaaaj4XXXymyo4xwxyv3gfa
region=us-phoenix-1 #(replace-with-your-oci-region)
key_file=/your-folder/your-oci-key.pem

Let's check that the configuration is correct by listing the buckets in a compartment. Provide the compartment OCID where your OCI buckets have been created:

# set chmod on .pem file
$ chmod 600 /your-folder/your-oci-key.pem

# get tenancy namespace
$ oci os ns get

{
  "data": "yournamespace"
}

# run the OCI CLI to list buckets in OCI Object Storage
$ oci os bucket list --compartment-id ocid1.compartment.oc1..aXXXn4hgg

# expect similar JSON output without errors

{
  "compartment-id": "ocid1.compartment.oc1..aXXX32q",
  "defined-tags": null,
  "etag": "25973f28-5125-4eae-a37c-73327f5c2644",
  "freeform-tags": null,
  "name": "your-bucket-name",
  "namespace": "your-tenancy-namespace",
  "time-created": "2023-03-26T16:18:17.991000+00:00"
}

If this command lists your buckets, or at least returns your tenancy namespace, we are good to move forward; if you have no buckets yet, you can create one in OCI Object Storage and test again. If there are issues, check the troubleshooting section at the end of this article.

Step 2: Install the required Python libraries

Add any additional libraries required to run the Python code (for example, the PDF examples below also need pypdf and chromadb):

$ python3 --version
Python 3.10.4

$ pip install -U langchain oci langchain-core langchain-cli \
    langchain_cohere cohere langgraph langsmith streamlit

Step 3: Write and run the Python code

The basic command to get the Oracle LLM handle is shown below:

from langchain_community.llms import OCIGenAI 

...

llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
    model_kwargs={"temperature": 0.7, "top_p": 0.75, "max_tokens": 1000}
)

To get OCI Generative AI Embeddings:

from langchain_community.llms import OCIGenAI 
from langchain_community.embeddings import OCIGenAIEmbeddings

...

embeddings = OCIGenAIEmbeddings(
    model_id="cohere.embed-english-light-v3.0",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com", 
    compartment_id="<Your-Compartment-Id>",
)

Examples

Example 1: Simple Generative AI Console App

Let's start with a basic Python script in the console to check connectivity, authentication, and the AI response:
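The repository provides a ready-made basic.py; the snippet below is only a minimal sketch of such a script, reusing the OCIGenAI handle from Step 3 (the prompt text and max_tokens value are illustrative assumptions, not the exact contents of basic.py):

# basic.py -- minimal console check of connectivity and authentication (sketch)
from langchain_community.llms import OCIGenAI

llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
    model_kwargs={"temperature": 0.7, "top_p": 0.75, "max_tokens": 200},
)

# Ask a simple question and print the raw model response
print(llm.invoke("Who built the pyramids?"))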

Download basic.py, run it, and view the output:

$ python3 basic.py

The Egyptians built the pyramids. The Egyptian pyramids are ancient pyramid-shaped masonry structures located in Egypt.

Example 2: Simple Generative AI Web Application with Input

In this simple Generative AI application, we will take an input prompt and display the AI result:
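quickstart.py is included in the repository; the snippet below is a minimal sketch of such an app (the widget labels, default prompt, and max_tokens value are illustrative assumptions):

# quickstart.py -- sketch of a Streamlit form that sends a prompt to OCI Generative AI
import streamlit as st
from langchain_community.llms import OCIGenAI

st.title("Oracle Generative AI Quickstart")

llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
    model_kwargs={"temperature": 0.7, "top_p": 0.75, "max_tokens": 300},
)

with st.form("my_form"):
    prompt = st.text_area("Enter your prompt:", "Who built the pyramids?")
    submitted = st.form_submit_button("Submit")
    if submitted:
        st.info(llm.invoke(prompt))  # display the model response in the browser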

Download quickstart.py and run it from your laptop or desktop command prompt:

$ streamlit run quickstart.py

You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://192.168.X.Y:8501

Example 3: Simple Healthcare AI Application

Build your first LangChain and Oracle Generative AI-based healthcare application. The application takes a disease name as input and uses a chain to list the disease, its symptoms, and basic first aid.

The application is built in four steps; a consolidated sketch follows the list.

  1. Get the required imports and initialise the Oracle Generative AI LLM.
  2. Build the Streamlit framework and input prompts.
  3. Initialise memory and chains.
  4. Set up the SequentialChain and print the output in the browser.
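The full listing ships as symptoms.py; the snippet below is a minimal sketch of how the four steps could fit together using LangChain's classic LLMChain, ConversationBufferMemory, and SequentialChain APIs (the prompt wording, variable names, and token limit are illustrative assumptions, not the exact contents of symptoms.py):

# symptoms.py -- sketch of the Healthcare app (prompts and keys are illustrative)
import streamlit as st
from langchain_community.llms import OCIGenAI
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain
from langchain.memory import ConversationBufferMemory

# Step 2: Streamlit framework and input prompt
st.title("Healthcare AI Application")
disease = st.text_input("Enter a disease name")

# Step 1: initialise the Oracle Generative AI LLM
llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
    model_kwargs={"temperature": 0.7, "top_p": 0.75, "max_tokens": 500},
)

# Step 3: memory plus one chain per question
memory = ConversationBufferMemory(input_key="disease", memory_key="chat_history")

describe_chain = LLMChain(
    llm=llm, output_key="description", memory=memory,
    prompt=PromptTemplate.from_template("Describe the disease {disease} in two sentences."),
)
symptoms_chain = LLMChain(
    llm=llm, output_key="symptoms",
    prompt=PromptTemplate.from_template("List the common symptoms of: {description}"),
)
firstaid_chain = LLMChain(
    llm=llm, output_key="firstaid",
    prompt=PromptTemplate.from_template("Suggest basic first aid for these symptoms: {symptoms}"),
)

# Step 4: run the chains in sequence and print the output in the browser
overall_chain = SequentialChain(
    chains=[describe_chain, symptoms_chain, firstaid_chain],
    input_variables=["disease"],
    output_variables=["description", "symptoms", "firstaid"],
)

if disease:
    result = overall_chain.invoke({"disease": disease})
    st.write(result["description"])
    st.write(result["symptoms"])
    st.write(result["firstaid"])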

Download symptoms.py and run the code.

Run the Application:

$ streamlit run symptoms.py

You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://192.168.X.Y:8501

Example 4: PDF Search AI Application

This is a smart application that reads and searches a PDF and provides AI output for a given prompt. Download the Budget PDF and put it in the same directory as the Python code.

Prompt: "What is Amrit Kaal?" Let's check the PDF document:

from langchain_community.document_loaders import PyPDFLoader
from langchain_community.llms import OCIGenAI 
from langchain_core.prompts import PromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain_community.embeddings import OCIGenAIEmbeddings
from langchain_community.vectorstores import Chroma
  
loader = PyPDFLoader("budget_speech.pdf")
pages = loader.load_and_split()

llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
    model_kwargs={"temperature": 0.7, "top_p": 0.75, "max_tokens": 1000}
)

embeddings = OCIGenAIEmbeddings(
    model_id="cohere.embed-english-light-v3.0",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
)

vectorstore = Chroma.from_documents(
    pages,
    embedding=embeddings    
)
retriever = vectorstore.as_retriever()

template = """ 
{context}
Indian Budget Speech : {input} 
"""
prompt = PromptTemplate.from_template(template)
 

chain = (
    {"context": retriever, 
     "input": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke("What is Amrit Kaal"))

Run the code in the terminal:

$ python3 searchpdf.py

-- Output from Generative AI after searching the PDF --

The Indian Budget Speech outlines the government's plans and priorities for the fiscal year 2024-25. One of the key focus areas highlighted in the speech is the concept of "Amrit Kaal," which translates to "Era of Immortality" or "Golden Era." 

The government aims to foster inclusive and sustainable development to improve productivity, create opportunities for all, and contribute to the generation of resources to power investments and fulfill aspirations. 

To achieve this, the government intends to adopt economic policies that facilitate growth and transform the country. This includes ensuring timely and adequate finances, relevant technologies, and appropriate training for Micro, Small and Medium Enterprises (MSME) to compete globally.  

....

Change the prompt and re-run the same code:

  1. Update the code:

    print(chain.invoke("What is India's fiscal deficit"))
  2. Run the code:

    $ python3 searchpdf.py

The AI output after searching through the Budget PDF is:

India's fiscal deficit for the year 2024-25 is estimated to be 5.1% of GDP, according to a budget speech by the country's Finance Minister Nirmala Sitharaman. This is in line with the government's goal to reduce the fiscal deficit to below 4.5% by 2025-26. 

Would you like me to tell you more about India's budgetary process or fiscal policies?

Example 5: PDF Search AI Application (Streamlit version)

This version wraps the Example 4 retrieval chain in a Streamlit web UI:

import streamlit as st
from langchain_community.llms import OCIGenAI
from langchain_community.document_loaders import PyPDFLoader
from langchain_core.prompts import PromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain_community.embeddings import OCIGenAIEmbeddings
from langchain_community.vectorstores import Chroma

st.title('🦜🔗 PDF AI Search Application')

def generate_response(input_text):
  # Load and split the PDF, then index the pages so the retriever can find relevant context
  loader = PyPDFLoader("budget_speech.pdf")
  pages = loader.load_and_split()
  llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
    model_kwargs={"temperature": 0.7, "top_p": 0.75, "max_tokens": 300}
  )
  embeddings = OCIGenAIEmbeddings(
    model_id="cohere.embed-english-light-v3.0",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<Your-Compartment-Id>",
  )
  vectorstore = Chroma.from_documents(
    pages,
    embedding=embeddings
  )
  retriever = vectorstore.as_retriever()
  template = """
  {context}
  Indian Budget Speech : {input}
  """
  prompt = PromptTemplate.from_template(template)
  # Same retrieval chain as Example 4, with the answer rendered in the browser
  chain = (
    {"context": retriever,
     "input": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
  )
  st.info(chain.invoke(input_text))

with st.form('my_form'):
  text = st.text_area('Enter Search:', 'What is Amrit Kaal?')
  submitted = st.form_submit_button('Submit')
  if submitted:
    generate_response(text)

# Run the app with: streamlit run pdfsearch.py

Download the Python code and Budget PDF used in this article.

Troubleshooting

Error Message:

Traceback (most recent call last):
  File "/some-folder/basic.py", line 3, in <module>
    llm = OCIGenAI(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for OCIGenAI
__root__
  Could not authenticate with OCI client. Please check if ~/.oci/config exists. If INSTANCE_PRINCIPLE or RESOURCE_PRINCIPLE is used, Please check the specified auth_profile and auth_type are valid. (type=value_error)

Solution: This is an authentication issue. Verify the config file settings:

$ vi ~/.oci/config

[DEFAULT]
user=ocid1.user.oc1..aaaaaaaaompuufgfXXXXndy5q
fingerprint=d5:84:a7:0e:bf:43:XXXXX:11:1a:50
tenancy=ocid1.tenancy.oc1..aaaaaaaaXXXXXgfa
region=us-phoenix-1
key_file=/Users/somefolder/oci_api_key_xxx.pem
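After correcting the config file, re-run the namespace check from Step 1 to confirm that authentication works before re-running the Python code:

$ oci os ns get

{
  "data": "yournamespace"
}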

References

Contributing

This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide.

Security

Please consult the security guide for our responsible security vulnerability disclosure process.

License

Copyright (c) 2024 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See LICENSE for more details.

ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.
