text_splitter error #12

Open
matebenyovszky opened this issue Feb 7, 2023 · 3 comments

@matebenyovszky

Hi, thanks for your work. I tried to test it, but got this error:

pythondev1-ubuntu@pythondev1-ubuntu:~$ dr-doc-search --train -i ~/dr-doc-search/tests/data/kh.pdf
2023-02-07 20:57:32 - text_splitter.py:59 - Created a chunk of size 1339, which is longer than the specified 1000
@alexamirejibi

I'm also getting this error.

@namuan
Owner

namuan commented Feb 18, 2023

Is it just a warning or does it stop at this point?
Do you have the full stack trace, if there is one?

@matebenyovszky
Author

matebenyovszky commented Feb 23, 2023

It is just a kind of warning in the terminal (still version 1.6.0, different environment, same PDF, Hungarian Tesseract pack):

ubuntu@ubuntu-Standard-PC-Q35-ICH9-2009:/etc/ImageMagick-6$ dr-doc-search -i ~/dr-doc-search/tests/data/kh.pdf --input-question "Mekkora a kamat pontosan?"
2023-02-08 01:20:48 - text_splitter.py:59 - Created a chunk of size 1339, which is longer than the specified 1000
Loading index from /home/ubuntu/OutputDir/dr-doc-search/kh/index/index.pkl
Question: Mekkora a kamat pontosan?
Answer:
A Kélesén ügyleti kamata valtozó, mértéke évi 3.09%. A kamatvaltoztatasi mutató H3K10.
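
For anyone else who hits this message: it looks like the usual behaviour of a separator-based text splitter, and the run above shows the tool still loads the index and answers afterwards. Below is a minimal sketch of how such a warning can arise. It is not the code from text_splitter.py; the function name `split_text`, the `"\n\n"` separator and the logger name are assumptions made for illustration. The idea is that text is cut only at separators, so a single piece with no separator inside it (for example a long OCR'd paragraph) cannot be made smaller and is kept whole with a warning.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("text_splitter_sketch")


def split_text(text, separator="\n\n", chunk_size=1000):
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size.

    A single piece longer than chunk_size is kept whole and only triggers a warning.
    """
    chunks = []
    current = []      # pieces collected for the chunk being built
    current_len = 0   # length of the chunk being built, separators included

    for piece in text.split(separator):
        # Cost of appending this piece to the chunk being built.
        added = len(piece) + (len(separator) if current else 0)

        # Flush the current chunk if the new piece would not fit.
        if current and current_len + added > chunk_size:
            chunks.append(separator.join(current))
            current, current_len = [], 0
            added = len(piece)

        current.append(piece)
        current_len += added

        # One piece alone already exceeds the limit: warn, but keep it.
        if len(current) == 1 and current_len > chunk_size:
            logger.warning(
                "Created a chunk of size %d, which is longer than the specified %d",
                current_len,
                chunk_size,
            )

    if current:
        chunks.append(separator.join(current))
    return chunks


if __name__ == "__main__":
    sample = "a" * 1339 + "\n\n" + "b" * 200 + "\n\n" + "c" * 300
    print([len(c) for c in split_text(sample)])  # chunk sizes: [1339, 502]
```

Under these assumptions the oversized chunk is still emitted, so nothing is lost. The only practical effect is that one chunk is larger than the configured size, which may matter if a later step has a strict input length limit.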
