Table of contents
- Introduction
- Requirements
- OpenAI API Key
- Usage
- Bot Code
- How to run the bot
- How to make your own Chat Bot
- Example 1: Create a Youtube Channel Bot
- Example 2: Chat with Multiple PDF files
- Tech Stack
- Questions and Support
Welcome to the Embedchain Chat Template tutorial for Replit. This repository includes the starter code to quickly get a bot running.
In this tutorial, we will create a Naval Ravikant bot. The bot will draw its context from the following sources.
- Naval Ravikant Joe Rogan Podcast
- The Almanack of Naval Ravikant
- Free Markets Provide the Best Feedback from Naval's blog
- More Compute Power Doesn’t Produce AGI from Naval's blog
- Question / Answer Pair:
- Q: Who is Naval Ravikant?
- A: Naval Ravikant is an Indian-American entrepreneur and investor.
If you want, you can pick the data sources yourself. Right now embedchain supports four data types, namely
- pdf file
- web page
- youtube video
- question answer
If you want support for more data types, please open an issue.
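As a rough sketch of how a mixed set of sources could be registered, the helper below loops over (data type, location) pairs and calls `add` on the bot. The `"youtube_video"` and `"pdf_file"` identifiers appear later in this tutorial; `"web_page"`, `"qna_pair"` and the `add_local` method are assumptions about the embedchain API, `load_sources` is a hypothetical helper, and the example.com URLs are placeholders rather than real sources.

```python
# Placeholder sources: swap in your own URLs.
# "web_page" and "qna_pair" are assumed identifiers for the remaining data types.
SOURCES = [
    ("youtube_video", "https://example.com/placeholder-video"),
    ("pdf_file", "https://example.com/placeholder.pdf"),
    ("web_page", "https://example.com/placeholder-post"),
]
QNA_PAIRS = [
    ("Who is Naval Ravikant?",
     "Naval Ravikant is an Indian-American entrepreneur and investor."),
]


def load_sources(app, sources, qna_pairs=()):
    """Register every source with the bot and return how many were added."""
    count = 0
    for data_type, location in sources:
        app.add(data_type, location)
        count += 1
    for pair in qna_pairs:
        # Assumption: local data such as question/answer pairs goes through add_local.
        app.add_local("qna_pair", pair)
        count += 1
    return count
```

With a real bot this would be called as `load_sources(App(), SOURCES, QNA_PAIRS)`; keeping the sources in plain lists makes it easy to swap datasets without touching the rest of the code.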
We use OpenAI's embedding model to create embeddings for chunks and the ChatGPT API as the LLM to get answers given the relevant docs. Make sure that you have an OpenAI account and an API key. If you don't have an API key, you can create one by visiting this link.
Once you have the API key, set it in a secret variable called OPENAI_API_KEY in this repl. You can use this documentation link to learn how to set a secret in a repl.
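Replit exposes secrets to the running program as environment variables, so a quick check before starting the bot can catch a missing key early. The helper below is a minimal sketch; `get_openai_api_key` is a hypothetical name and not part of embedchain.

```python
import os


def get_openai_api_key(env=os.environ):
    """Return the OpenAI API key from the environment, or raise a clear error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Add it as a secret in this repl "
            "before running the bot."
        )
    return key
```

Calling `get_openai_api_key()` at the top of main.py fails fast with a readable message instead of an authentication error halfway through indexing.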
- You can find the code in the main.py file.
- You can fork the template and then click the "Run" button.
- You will see some statements about chunks being created. If you want to learn more about the process, read How does it work in our readme, where we have explained the entire functionality.
- You can enter a query and get an answer from the dataset.
- In the tutorial we have explained how to create a Naval Ravikant chat bot, but you can create your own just by updating the datasets.
- In the video we will cover how to create a YouTube channel bot. We will take the Y Combinator YouTube channel as an example and create a chat bot over all its videos.
- For this tutorial, we will use a Python package called scrapetube, which fetches the list of all videos in a channel.
import itertools
import scrapetube
from embedchain import App
ycombinator_bot = App()
# change this to whichever channel you want to build a bot for
channel_url = 'https://www.youtube.com/@ycombinator'
# get_channel returns a generator, so it cannot be sliced with [:5]
videos = scrapetube.get_channel(channel_url=channel_url)
# take only the first 5 videos; to index all of them, loop over videos directly
for video in itertools.islice(videos, 5):
    url = f"https://www.youtube.com/watch?v={video['videoId']}"
    # add each video URL (not the channel URL) to the bot
    ycombinator_bot.add("youtube_video", url)
# YC has 560+ videos, so indexing all of them makes many API calls. Make sure to set a usage limit on your OpenAI account
print(ycombinator_bot.query("What is product market fit?"))
- Yes, with only this much code you can create a chat bot over any YouTube channel.
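Because scrapetube yields videos lazily, the simplest way to cap how many are indexed is `itertools.islice`, which works on any iterable. The sketch below uses a hypothetical helper, `first_video_urls`, and a stand-in generator, `fake_channel`, in place of a real channel so it can run without network access.

```python
import itertools


def first_video_urls(videos, limit):
    """Build watch URLs for at most `limit` videos from any iterable of video dicts."""
    return [
        f"https://www.youtube.com/watch?v={video['videoId']}"
        for video in itertools.islice(videos, limit)
    ]


def fake_channel():
    # Stand-in for a scrapetube channel listing: yields dicts with a 'videoId' key.
    for i in range(10):
        yield {"videoId": f"id{i}"}


urls = first_video_urls(fake_channel(), 3)
print(urls)
```

Each URL returned by the helper can then be passed to the bot's `add("youtube_video", url)` call.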
- We have all seen Mayo's tutorial on chatting with multiple PDFs. We can replicate the same using a few lines of code in embedchain.
- Here is the code to get a chat-with-multiple-PDFs bot up and running:
from embedchain import App
chat_with_pdf_app = App()
chat_with_pdf_app.add("pdf_file", "https://digitalassets.tesla.com/tesla-contents/image/upload/IR/TSLA-Q4-2022-Update")
chat_with_pdf_app.add("pdf_file", "https://digitalassets.tesla.com/tesla-contents/image/upload/IR/TSLA-Q3-2022-Update")
chat_with_pdf_app.add("pdf_file", "https://digitalassets.tesla.com/tesla-contents/image/upload/IR/TSLA-Q2-2022-Update")
chat_with_pdf_app.query("What is tesla earning in 2022?")
- Please note one thing: Mayo has implemented querying data across multiple namespaces in his tutorial videos. It's not supported as of now. If you want me to prioritize this, please leave your view on this GitHub issue.
embedchain is built on the following stack:
- Langchain as an LLM framework to load, chunk and index data
- OpenAI's Ada embedding model to create embeddings
- OpenAI's ChatGPT API as LLM to get answers given the context
- Chroma as the vector database to store embeddings
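To make the flow through this stack concrete, here is a toy, dependency-free sketch of the same steps: chunk the data, embed each chunk, store the embeddings, retrieve the closest chunk for a query, and build a prompt from it. It is not how embedchain is implemented. Word counts stand in for the Ada embedding model, a plain list stands in for Chroma, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: lowercase word counts (a real bot would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, store, k=1):
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


documents = [
    "Chroma stores the embeddings so they can be searched later.",
    "The language model writes an answer from the retrieved context.",
]
# The "vector database": a list of (chunk text, embedding) pairs.
store = [(c, embed(c)) for doc in documents for c in chunk(doc, size=12)]
context = retrieve("Where are embeddings stored?", store)
# In the real stack this prompt would be sent to the LLM.
prompt = f"Answer using this context: {context[0]}"
```

In the real stack the retrieved chunks and the question are sent to the ChatGPT API, which writes the final answer.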
- If you have any questions, join our Discord, open an issue on GitHub or DM Taranjeet on Twitter.
- If you like embedchain, you can star and watch it to stay updated with the latest releases.
MIT License