
ERROR:root:IndexFlat.search() missing 3 required positional arguments: 'k', 'distances', and 'labels' #61

Open
gp48maz1 opened this issue Apr 21, 2023 · 23 comments

Comments

@gp48maz1

I'm not sure where this is coming from. Maybe FAISS?

Also, is this project supposed to work? I feel like I've dumped 3-4 hours into making updates for a variety of issues.

@abegon-ta

same

@jocca1985

Having similar issue:
ERROR:root:search() missing 3 required positional arguments: 'k', 'distances', and 'labels'

@NikhilSehgal123

Any way to resolve this? I'm getting the same issue.

@NikhilSehgal123

NikhilSehgal123 commented Apr 27, 2023

This is what happens (in my opinion):

FAISS ships different wheels for arm64 chips and Intel/AMD chips. If you vectorise a document on an arm64 machine, you should use FAISS on an arm64 machine to interact with that document.

So right now, my Mac is arm64 and I vectorised docs on my Mac using the arm64 wheel. However, in Azure Functions it uses the Intel wheel, which is slightly different: Intel/AMD builds of FAISS are compiled with AVX2 support (a SIMD instruction-set extension) that the arm64 builds don't use.

So how can you solve this?
Re-create the vectorised documents on your current machine and it will work.
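A minimal sketch of that workaround, assuming your project has a rebuild script (build_index.py is a hypothetical name standing in for whatever re-runs your FAISS.from_texts and faiss.write_index steps):

```shell
# Run these on the machine/architecture that will actually serve queries,
# so the index files are written by the same faiss wheel that will read them.
pip install faiss-cpu          # pulls the wheel matching this architecture
python build_index.py          # hypothetical: re-embeds the docs and rewrites the .index/.pkl files
```

The point is simply to never copy .index/.pkl files between machines whose faiss wheels differ.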

@NikhilSehgal123

I opened an issue for the faiss-cpu lib here: kyamagu/faiss-wheels#74

@completelyboofyblitzed

I'll also share that I hit the error when I created the vectorstore using the Windows cmd environment but then ran make start with bash for Windows. When I deleted the vectorstore and did everything through bash only, there was no error.

@aiakubovich

Same problem when building the app on Windows and then putting it into a Docker container.

@Softtech247

I had the same issue. I created my vectorstore on my Windows machine, but when I deployed it on an AWS Ubuntu server I hit the error. I resolved it by creating a new vectorstore (knowledge base) on Ubuntu. Everything works well.

@abegon-ta

me, too

@polybius12

+1

@fidoarg

fidoarg commented Sep 14, 2023

Same

@aboutmydreams

aboutmydreams commented Dec 12, 2023

My computer is arm64. I stored the vectors in .index and .pkl files locally, then deployed them to a Linux environment and built an API that searches those files. For larger files, I found that an error appears when calling the search API for the second time, but smaller files do not have this problem.

my code:

from langchain.embeddings import OpenAIEmbeddings
from langchain import LLMChain
from langchain.text_splitter import CharacterTextSplitter
from langchain.chat_models import AzureChatOpenAI
from langchain import PromptTemplate
import faiss
import numpy as np
import pickle
import logging
from langchain.vectorstores import FAISS
import os
import ast

class CustomLangChain:

  def __init__(self, azure_account='ntt'):
    self.azure_account = azure_account
    account_config = get_azure_account(azure_account)
    self.embedding = OpenAIEmbeddings(
      model=account_config["embeded_deploy_name"], chunk_size=1)

    self.index = None
    self.chat_llm = AzureChatOpenAI(
      temperature=0.12,
      openai_api_version=account_config["gpt_35_turbo_version"],
      deployment_name=account_config["gpt_35_turbo_deploy_name"])

  def save_vector_from_txt(self, txt_file_path):
    docs = []
    space_name = txt_file_path.split('/')[-1].replace('.txt', '')
    with open(txt_file_path, 'r') as f:
      txt_text = f.read()
    textSplitter = CharacterTextSplitter(chunk_size=300, separator="\n")
    docs.extend(textSplitter.split_text(txt_text))
    store = FAISS.from_texts(docs, self.embedding)
    faiss.write_index(store.index, f'{current_dir}/models/{space_name}.index')
    store.index = None
    with open(f'{current_dir}/models/{space_name}_doc.pkl', "wb") as f:
      # 将 store 和 docs 一起保存
      pickle.dump((store, docs), f)
      f.close()

  def search(self, question, space_name, top_k=5):
    logging.info("==========================")
    print("==========================")
    logging.info(self.azure_account)
    print(self.azure_account)
    # pass values as format args; logging.info(question, space_name, top_k)
    # would treat them as %-formatting arguments and raise
    logging.info("%s %s %s", question, space_name, top_k)
    print(question, space_name, top_k)
    logging.info("========================")
    print("==========================")
    # 1. Load the saved FAISS index and data
    index = faiss.read_index(f'{current_dir}/models/{space_name}.index')
    with open(f'{current_dir}/models/{space_name}_doc.pkl', "rb") as f:
      loaded_data = pickle.load(f)
      store: FAISS = loaded_data[0]
      docs: str = loaded_data[1]

    store.index = index
    # 2. Generate an embedding vector for the query
    query_vector = self.embedding.embed_query(question)

    # If query_vector is a list, convert it to a numpy array and reshape it
    if isinstance(query_vector, list):
      query_vector = np.array(query_vector).reshape(1, -1)

    # 3. Query the FAISS index for the most similar document indices
    D, I = store.index.search(query_vector, top_k)

    # 4. Collect the most similar results
    results = [{
      "doc": parse_to_dict(docs[i]),
      "similarity": str(D[0][idx])
    } for idx, i in enumerate(I[0])]

    return results

  def answer(self,
             question: str,
             space_name="space_name",
             prompt_name="prompt_name",
             history=[]):

    try:
      # Build the LLM chain from the prompt template, history, and question
      with open(f"{current_dir}/prompt/{prompt_name}.txt", "r") as f:
        promptTemplate = f.read()

      prompt = PromptTemplate(
        template=promptTemplate,
        input_variables=["history", "context", "question"])

      llmChain = LLMChain(
        prompt=prompt,
        verbose=True,
        llm=self.chat_llm,
      )

      # Load the search index and vectorstore data from files
      index = faiss.read_index(f"{current_dir}/models/{space_name}.index")
      with open(f"{current_dir}/models/{space_name}_doc.pkl", "rb") as f:
        loaded_data = pickle.load(f)
        store: FAISS = loaded_data[0]
        docs: str = loaded_data[1]
      store.index = index

      # Retrieve context via the search index and generate an answer with the model
      # docs = store.similarity_search(question)
      # contexts = []
      # contexts_str_list = []
      # for i, doc in enumerate(docs):
      #   contexts.append(f"Context {i}:\n{doc.page_content}")
      #   contexts_str_list.append(doc.page_content)
      # Reuse the local search method
      search_result = self.search(question, space_name, 3)
      contexts = [
        f"Context {index}:\n{doc['doc']}"
        for index, doc in enumerate(search_result)
      ]
      contexts_str_list = [doc["doc"] for doc in search_result]

      answer = llmChain.predict(question=question,
                                context="\n\n".join(contexts),
                                history=history)
      response = {
        "answer": answer,
        "history": history + [f"Human: {question}", f"Bot: {answer}"],
        "contexts": contexts_str_list,
      }
      return response
    except Exception as e:
      # Pass the exception as a format argument; logging.error("CustomLangChain.answer", e)
      # itself raises "not all arguments converted during string formatting"
      logging.error("CustomLangChain.answer: %s", e)
      return {
        "answer": f"Sorry, I couldn't understand you. Due to {str(e)}",
        "history": history + [f"Human: {question}", f"Bot: {str(e)}"],
      }

error output:

Traceback (most recent call last):
  File "/home/runner/azurecustomdataai/azure_chat.py", line 159, in answer
    docs = store.similarity_search(question)
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/langchain/vectorstores/faiss.py", line 207, in similarity_search
    docs_and_scores = self.similarity_search_with_score(query, k)
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/langchain/vectorstores/faiss.py", line 177, in similarity_search_with_score
    docs = self.similarity_search_with_score_by_vector(embedding, k)
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/langchain/vectorstores/faiss.py", line 151, in similarity_search_with_score_by_vector
    scores, indices = self.index.search(np.array([embedding], dtype=np.float32), k)
TypeError: IndexFlat.search() missing 3 required positional arguments: 'k', 'distances', and 'labels'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/logging/__init__.py", line 1100, in emit
    msg = self.format(record)
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/logging/__init__.py", line 943, in format
    return fmt.format(record)
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/logging/__init__.py", line 678, in format
    record.message = record.getMessage()
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/logging/__init__.py", line 368, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/socketserver.py", line 683, in process_request_thread
    self.finish_request(request, client_address)
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/werkzeug/serving.py", line 390, in handle
    super().handle()
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/http/server.py", line 432, in handle
    self.handle_one_request()
  File "/nix/store/hd4cc9rh83j291r5539hkf6qd8lgiikb-python3-3.10.8/lib/python3.10/http/server.py", line 420, in handle_one_request
    method()
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/werkzeug/serving.py", line 362, in run_wsgi
    execute(self.server.app)
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/werkzeug/serving.py", line 323, in execute
    application_iter = app(environ, start_response)
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/flask/app.py", line 2213, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/flask/app.py", line 2190, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/runner/azurecustomdataai/venv/lib/python3.10/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "main.py", line 70, in azure_chat
    response = CustomLangChain(azure_account=model_name).answer(
  File "/home/runner/azurecustomdataai/azure_chat.py", line 175, in answer
    logging.error("CustomLangChain.answer", e)
Message: 'CustomLangChain.answer'
Arguments: (TypeError("IndexFlat.search() missing 3 required positional arguments: 'k', 'distances', and 'labels'"),)

I also discovered another strange phenomenon: for larger .index and .pkl files, this method reliably reproduces the error on the second API call.

      # Retrieve context via the search index and generate an answer with the model
      # docs = store.similarity_search(question)
      # contexts = []
      # contexts_str_list = []
      # for i, doc in enumerate(docs):
      #   contexts.append(f"Context {i}:\n{doc.page_content}")
      #   contexts_str_list.append(doc.page_content)

However, with this method, the same error is reported on the first call. The problem occurs at D, I = store.index.search(query_vector, top_k):

      # Reuse the local search method
      search_result = self.search(question, space_name, 3)
      contexts = [
        f"Context {index}:\n{doc['doc']}"
        for index, doc in enumerate(search_result)
      ]
      contexts_str_list = [doc["doc"] for doc in search_result]

      answer = llmChain.predict(question=question,
                                context="\n\n".join(contexts),
                                history=history)
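The error itself can be reproduced without faiss at all. faiss-cpu normally patches a numpy-friendly search(x, k) wrapper over the raw SWIG-level method, whose underlying C++ signature is search(n, x, k, distances, labels); when an index written by one wheel is loaded under a mismatched build, that wrapper apparently isn't applied, and callers hit exactly this TypeError. A minimal sketch with a hypothetical stand-in class (RawIndexFlat is not a real faiss type):

```python
import numpy as np

class RawIndexFlat:
    """Hypothetical stand-in for the unwrapped SWIG-level IndexFlat (not real faiss)."""

    def __init__(self, vectors):
        self.vectors = np.asarray(vectors, dtype=np.float32)

    def search(self, n, x, k, distances, labels):
        # Low-level convention: the caller pre-allocates the output buffers.
        d2 = ((x[:, None, :] - self.vectors[None, :, :]) ** 2).sum(axis=-1)
        idx = np.argsort(d2, axis=1)[:, :k]
        labels[:] = idx
        distances[:] = np.take_along_axis(d2, idx, axis=1)

def numpy_friendly_search(index, x, k):
    """The convenience wrapper faiss normally installs as index.search(x, k)."""
    x = np.ascontiguousarray(x, dtype=np.float32)
    n = x.shape[0]
    distances = np.empty((n, k), dtype=np.float32)
    labels = np.empty((n, k), dtype=np.int64)
    index.search(n, x, k, distances, labels)
    return distances, labels
```

Calling the unwrapped object as index.search(query, k) binds query to n and k to x, then fails with "search() missing 3 required positional arguments: 'k', 'distances', and 'labels'", the same message as in this issue.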

@Satishchekuri273

I got the same error while deploying my app on Render. This is an issue with faiss: first I used faiss-cpu==1.7.4, then changed it to faiss-cpu==1.7.2 in the requirements file, and the issue was solved.
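As a concrete version of that fix, assuming a pip-based deploy:

```shell
# In requirements.txt, change the pin:
#   faiss-cpu==1.7.4  ->  faiss-cpu==1.7.2
pip install "faiss-cpu==1.7.2"
```

Note that pinning only works if the environment that wrote the index files uses the same version.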

@suhasml

suhasml commented Feb 19, 2024

I got the same error while deploying my app on render. This is the issue with faiss. First I used faiss-cpu==1.7.4 and then changed it to faiss-cpu==1.7.2 in requirements file and the issue was solved.

You're a lifesaver. I was dealing with this issue for almost 4 hours and wasn't sure why. Then I read about faiss-cpu having different wheels on different OSes, and downgrading it to 1.7.2 fixed the problem. :))

@doubtfire009

I ran my code in Windows Conda and all was OK. I tried Ubuntu 20 and encountered the same issue, so I downgraded faiss-cpu to 1.7.2 and everything is OK now.

@doubtfire009

I got the same error while deploying my app on render. This is the issue with faiss. First I used faiss-cpu==1.7.4 and then changed it to faiss-cpu==1.7.2 in requirements file and the issue was solved.

You're a lifesaver. I was dealing with this issue for almost 4 hours and wasn't sure why. Then read about faiss-cpu having different wheels in different OS and downgrading it to 1.7.2 fixed the problem. :))

I also fixed the issue on Ubuntu 20. Thanks!

@scrwghub

Is there going to be any solution to this absolute amateur hour issue other than version locking to 1.7.2?

@balakarthick-bk

balakarthick-bk commented May 16, 2024

Hi all,
I found a solution for this: check your faiss-cpu version (I was on faiss-cpu==1.7.4).

@brdwrd

brdwrd commented Sep 12, 2024

I'd like to bump this issue. I'm running a job on my M3 Mac to build the search index and then deploying it on Linux with Docker. It's incredibly annoying that the same version is not compatible across architectures.

@scrwghub

scrwghub commented Sep 12, 2024 via email

@ShriyaAgrawal

Having similar issue: ERROR:root:search() missing 3 required positional arguments: 'k', 'distances', and 'labels'

Please update the faiss version and try again. I was facing a similar issue with version 1.8.0; after I downgraded to 1.7.2, it resolved my issue.

@Satishchekuri273

Satishchekuri273 commented Sep 30, 2024 via email

@ShriyaAgrawal

I got the same error while deploying my app on render. This is the issue with faiss. First I used faiss-cpu==1.7.4 and then changed it to faiss-cpu==1.7.2 in requirements file and the issue was solved.

You're a lifesaver. I was dealing with this issue for almost 4 hours and wasn't sure why. Then read about faiss-cpu having different wheels in different OS and downgrading it to 1.7.2 fixed the problem. :))

I also fixed the issue on Ubuntu 20. Thanks!

It resolved the issue for me as well.
