Hallucination: Answers are not from the docs but from the model's own knowledge base #517
Comments
I asked a question outside the context of state_of_the_union.txt, and the model still answered it.
This clearly shows that the model returns answers from outside the source document.
I agree with you, so how can we fix that? I need answers only from the embeddings built on top of my docs; I don't want the LLM to return answers from its base knowledge.
I also ran into this problem: the answer was generated not from the indexed documents but from the knowledge base the model was trained on. How can I fix this so that answers are generated only from the indexed documents?
Can anyone from the community help here? Maybe the creator or co-creators of PrivateGPT?
One possible way is to use a custom prompt. You have to modify the script that builds the question-answering chain: at the top of the file, add a prompt template that tells the model to answer only from the provided context, then pass it to the chain when it is created. Using a model that follows instructions well, compare the answers before and after the change.
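For reference, here is a minimal sketch of that kind of change for the LangChain-based privateGPT.py. It is an illustration only, not the exact snippet from the comment above; the file name, variable names, and prompt wording are assumptions.

```python
# Sketch only: assumes the LangChain-based privateGPT.py, where the chain is
# built with RetrievalQA.from_chain_type(...). Adapt names to your checkout.
from langchain.prompts import PromptTemplate

# Prompt that restricts the model to the retrieved context.
template = """Use only the following context to answer the question.
If the answer is not contained in the context, reply exactly with "I don't know".

Context:
{context}

Question: {question}
Answer:"""

QA_PROMPT = PromptTemplate(template=template, input_variables=["context", "question"])

# Then pass the custom prompt when the chain is created, e.g.:
# qa = RetrievalQA.from_chain_type(
#     llm=llm,
#     chain_type="stuff",
#     retriever=retriever,
#     return_source_documents=True,
#     chain_type_kwargs={"prompt": QA_PROMPT},
# )
```

Even with a prompt like this, smaller models may still ignore the instruction, which matches what several people report below.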
Thanks for your comment @Guillaume-Fgt, I appreciate that, but even with a custom prompt it doesn't work for the MPT-7B model or the default models shared in the repository, which I have used. Let me try out the model you are suggesting. Can you share the model link so I can download it?
Sure, here is the link.
This seems to be a problem with llama.cpp recently changing its file formats. See #567 (comment), which might help you use the model.
I ingested the State of the Union text file along with my 2022 tax return in PDF format, and when I asked "how long is the state of the union" it responded with all the various numerical values it found in my 1040 form, from both the federal and state returns. So yeah... something is definitely broken.
For me, there is hallucination even after using what @Guillaume-Fgt wrote.
Same here. I truly hope this is part of a fix in the near future. Queries to list the contents or chapters of a book do not faithfully render the original text; such a query will produce freely invented results every single time. For what it's worth, here are my details: Ubuntu Server 22.04 LTS, an ingestion log with no errors, and my settings.yaml.
Hey folks, first of all, I just want to say thank you for such a fantastic project. I'm still having this issue 😟. In my case, I have added the documentation (in Markdown) of an internal project related to platform engineering (so Kubernetes, GitHub Actions, Terraform and the like), and while adjusting parameters helps (the settings that work best for me are in the config below), the model still invents things. For instance, when asked how we can push an image to a repository (the answer being to use a provided, internal GitHub Action), the example it provides includes an action that doesn't exist (docker/docker):

```yaml
info: Image Replication
on:
  push:
    branches:
      - main
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Build and push image
        uses: docker/docker@v2
        with:
          dockerfile: Dockerfile
          push: true
          registry: ecr
          username: <your-username>
          password: <your-password>
```

My system chat and query prompts are the defaults, so I'm not sure why it is not complying with them. I'm fairly new to this whole ecosystem, but I'm more than happy to spend time debugging and testing.

Additional context

config:

```yaml
llm:
  mode: ollama
  max_new_tokens: 512
  context_window: 3900
  temperature: 0.01
(...)
ollama:
  llm_model: llama2
  embedding_model: mxbai-embed-large
  keep_alive: 5m
  tfs_z: 1.0
  top_k: 1
  top_p: 0.1
  repeat_last_n: 64
  repeat_penalty: 0.9
```

It was happening as well using mistral as the LLM model and …
Hi @jnogol, did you ever resolve this issue? If yes, can you please share how?
Describe the bug and how to reproduce it
The code base works completely fine. Fantastic work! I have tried different LLMs. Even after creating embeddings for multiple docs, the answers to my questions always come from the model's own knowledge base; it does not return answers from the documents. I have defined a prompt template too, but that doesn't work either. The source document is something the model has already seen during training.
Expected behavior
If embeddings are created for the docs, the LLM should return answers from those only, and if it can't answer from them, it should not return anything. Right now it hallucinates like anything. How can I get results only from the documents I have created the embeddings for?
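One pattern that can help with this is to check the retrieval scores first and refuse to generate when nothing relevant is found. The sketch below is an illustration only, not part of PrivateGPT itself; the vector store methods, the chain interface, and the threshold value are assumptions to adapt to your setup.

```python
# Sketch only: refuse to answer when retrieval finds nothing relevant.
# Assumes a LangChain vector store (e.g. Chroma or FAISS) and a
# load_qa_chain-style chain; the 0.4 threshold is a placeholder to tune.
def answer_from_docs(query, vectorstore, qa_chain, score_threshold=0.4):
    # similarity_search_with_score returns (document, score) pairs. For
    # distance-based stores lower means more similar; flip the comparison
    # if your store returns similarities instead of distances.
    hits = vectorstore.similarity_search_with_score(query, k=4)
    relevant = [doc for doc, score in hits if score <= score_threshold]
    if not relevant:
        return "I don't know. Nothing relevant was found in the indexed documents."
    return qa_chain.run(input_documents=relevant, question=query)
```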
Additional context
I have tried MPT-7B, GPT4All, LLaMA, etc. I have tried different chains in LangChain, but nothing works.