Bug with os.environ['FAISS_NO_AVX2'] = '1' #8857

Closed
14 tasks
aiakubovich opened this issue Aug 7, 2023 · 4 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: embeddings Related to text embedding models module Ɑ: vector store Related to vector store module

Comments


aiakubovich commented Aug 7, 2023

System Info

I was trying to create FAISS embeddings that would work on different platforms, so I set:
os.environ['FAISS_NO_AVX2'] = '1'
as recommended in https://github.com/langchain-ai/langchain/blob/6cdd4b5edca511b0015f1b39102225fe638d8359/langchain/vectorstores/faiss.py

It works on Windows, but when I try to create embeddings in a Docker image I get TypeError: IndexFlatCodes.add() missing 1 required positional argument: 'x'.

Full error:

TypeError: IndexFlatCodes.add() missing 1 required positional argument: 'x'
Traceback:
File "/usr/local/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
File "/app/src/pages/1_💬__AI-Chat.py", line 127, in <module>
    chatbot = utils.setup_chatbot(
              ^^^^^^^^^^^^^^^^^^^^
File "/app/./src/modules/utils.py", line 121, in setup_chatbot
    vectors = embeds.getDocEmbeds(file, uploaded_file.name)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/./src/modules/embedder.py", line 104, in getDocEmbeds
    self.storeDocEmbeds(file, original_filename)
File "/app/./src/modules/embedder.py", line 86, in storeDocEmbeds
    vectors = FAISS.from_documents(data, embeddings)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/vectorstores/base.py", line 336, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 550, in from_texts
    return cls.__from(
           ^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 505, in __from
    index.add(vector)

langchain==0.0.226

Who can help?

@hwchase17

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

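import os

# Set as described under System Info above.
os.environ['FAISS_NO_AVX2'] = '1'

from langchain.document_loaders.csv_loader import CSVLoader
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

# tmp_file_path is not defined in this snippet; it is the path to a CSV file.
loader = CSVLoader(
    file_path=tmp_file_path,
    encoding="utf-8",
    csv_args={"delimiter": ","},
)
data = loader.load()
embeddings = OpenAIEmbeddings(...)
vectors = FAISS.from_documents(data, embeddings)  # raises the TypeError in Docker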

Expected behavior

Embeddings should be generated.

@dosubot dosubot bot added Ɑ: embeddings Related to text embedding models module Ɑ: vector store Related to vector store module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Aug 7, 2023

dosubot bot commented Aug 7, 2023

Answer generated by a 🤖

Answer

Hello,

Thank you for providing detailed information about the issue you're facing. The error message you're seeing, TypeError: IndexFlatCodes.add() missing 1 required positional argument: 'x', indicates that the add() method of the IndexFlatCodes class expects an argument that it is not receiving.

The add() method is used in the FAISS.__from() method in the LangChain codebase to add vectors to the FAISS index. Here's the relevant part of the code:

vector = np.array(embeddings, dtype=np.float32)
if normalize_L2 and distance_strategy == DistanceStrategy.EUCLIDEAN_DISTANCE:
    faiss.normalize_L2(vector)
index.add(vector)

In this code, vector is a numpy array created from the list of embeddings. This array is then passed to the add() method of the index object, which is an instance of a FAISS index class (IndexFlatIP or IndexFlatL2).

The error message suggests that the add() method is not receiving this vector argument. This could be due to a number of reasons, but without more information, it's hard to say exactly what's causing the issue.

One possibility is that the embeddings list is empty, which would result in vector being an empty numpy array. This could happen if there was an issue with the from_documents() method that was used to generate the embeddings.
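The empty-embeddings possibility can be ruled in or out with a quick check before the index is built. The sketch below is a minimal, standard-library-only example; `check_embeddings` is a hypothetical helper and not part of LangChain or FAISS. Note that an empty list is still passed to add() as an argument, so this check tests the possibility rather than explaining the "missing 1 required positional argument" message by itself.

```python
from typing import Sequence


def check_embeddings(embeddings: Sequence[Sequence[float]]) -> int:
    """Return the shared vector dimension, or raise ValueError.

    Hypothetical helper for illustration; not a LangChain or FAISS API.
    """
    if len(embeddings) == 0:
        raise ValueError("no embeddings were produced")
    dim = len(embeddings[0])
    if dim == 0:
        raise ValueError("embedding vectors are empty")
    # Every vector must have the same length to form a 2-D array.
    for i, vec in enumerate(embeddings):
        if len(vec) != dim:
            raise ValueError(
                f"vector {i} has length {len(vec)}, expected {dim}"
            )
    return dim
```

If this passes on the list returned by the embedding model, the input data is unlikely to be the cause and the FAISS import path is the next thing to look at.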

Another possibility is that there's an issue with the FAISS library itself. You mentioned that you're setting the FAISS_NO_AVX2 environment variable to '1' to ensure compatibility across different platforms. This should load the FAISS library without AVX2 optimization, but it's possible that there's a bug or incompatibility issue that's causing the add() method to fail.

To help troubleshoot this issue, could you please provide more information about the data you're using and the exact steps you're taking to generate the embeddings and create the FAISS index? Specifically, it would be helpful to know:

  • What is the content of the data variable that you're passing to the from_documents() method?
  • What are the arguments you're passing to the OpenAIEmbeddings() constructor?
  • Are you able to successfully generate embeddings outside of the Docker environment?

This additional information will help us better understand the issue and provide a more accurate solution.

Best regards,
Dosu

This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

dosubot bot commented Nov 6, 2023

Hi, @aiakubovich! I'm Dosu, and I'm helping the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, you are experiencing a bug when trying to create FAISS embeddings in a Docker image. You mentioned that you are getting a TypeError related to the IndexFlatCodes.add() method, even after setting os.environ['FAISS_NO_AVX2'] = '1'. You reached out to @hwchase17 for help, and I suggested that the issue could be caused by an empty embeddings list or a bug/incompatibility in the FAISS library. I asked for more information about the data and steps taken to generate the embeddings and create the FAISS index.

Before we proceed, we would like to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 6, 2023
Contributor

blue-hope commented Nov 12, 2023

I think that when FAISS_NO_AVX2 is set to "1", the function dependable_faiss_import returns the module obtained via from faiss import swigfaiss as faiss.

That module is not wrapped by handle_Index in the __init__.py of 'faiss', so the method signatures don't match at all.
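The mismatch described above can be illustrated without FAISS installed. The sketch below is a simulation only: `RawIndex`, `WrappedIndex`, and `wrap_index` are hypothetical stand-ins, and it assumes the unwrapped method takes a count plus the data, which is consistent with the error message but is not confirmed in this thread.

```python
class RawIndex:
    """Stand-in for an unwrapped index: add() takes a count and the data."""

    def __init__(self):
        self.ntotal = 0

    def add(self, n, x):
        self.ntotal += n


def wrap_index(cls):
    """Stand-in for a wrapping step that gives add() a one-argument form."""
    raw_add = cls.add

    def add(self, x):
        # Derive the count from the data, then delegate to the raw method.
        raw_add(self, len(x), x)

    cls.add = add
    return cls


class WrappedIndex(RawIndex):
    pass


vectors = [[0.1, 0.2], [0.3, 0.4]]

# Calling the raw class the way the wrapped API is normally called fails.
raw = RawIndex()
try:
    raw.add(vectors)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# After wrapping, the same one-argument call succeeds.
wrapped = wrap_index(WrappedIndex)()
wrapped.add(vectors)
```

Under this assumption, the one-argument call index.add(vector) from the traceback would bind the array to the count parameter and leave the data parameter unfilled, which would produce the reported TypeError.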

@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 12, 2023
dosubot bot commented Feb 11, 2024

Hi, @aiakubovich

I'm helping the LangChain team manage our backlog and am marking this issue as stale. From what I understand, you are encountering a TypeError when attempting to create FAISS embeddings in a Docker image after setting os.environ['FAISS_NO_AVX2'] = '1'. There have been responses from @hwchase17, @dosubot, and @blue-hope, discussing potential causes of the issue and requesting more information about the data and steps taken to generate the embeddings and create the FAISS index. However, the issue remains unresolved.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and cooperation.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Feb 11, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 18, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Feb 18, 2024

2 participants