Describe the bug
We have a document titled "1st go live". It can be found by searching for "1st", "live", or "1st live", but the search returns no results for "go live" or "1st go live".
This is due to an inconsistency between the JavaScript stemmer and the Python one.
JavaScript accepts short words like "go" and looks for them in the index file. Python, however, explicitly rejects anything shorter than 3 characters when building the index.
The Python implementation also has special cases for Unicode and cardinal numbers, but those are not relevant here.
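The mismatch can be illustrated with a minimal, self-contained sketch. This is not Sphinx's actual code: `MIN_INDEXED_LENGTH`, `index_words`, `search_all_terms`, and `search_filtered` are hypothetical names, and the model assumes only what is described above (the index side drops words shorter than 3 characters, while the query side looks up every term).

```python
# Hypothetical model of the reported behaviour, not the Sphinx implementation.
MIN_INDEXED_LENGTH = 3  # assumption: threshold taken from the description above


def index_words(text):
    """Index side: keep only words with at least MIN_INDEXED_LENGTH characters."""
    return {w for w in text.lower().split() if len(w) >= MIN_INDEXED_LENGTH}


def search_all_terms(index, query):
    """Query side without a length filter: every term must be in the index."""
    return all(term in index for term in query.lower().split())


def search_filtered(index, query):
    """Query side that applies the same length filter as the index side."""
    terms = [t for t in query.lower().split() if len(t) >= MIN_INDEXED_LENGTH]
    return bool(terms) and all(term in index for term in terms)


index = index_words("1st go live")  # "go" never reaches the index

for query in ("1st live", "go live", "1st go live"):
    print(query, search_all_terms(index, query), search_filtered(index, query))
```

In this sketch the unfiltered search fails for any query containing "go", because that term was never indexed, while applying the same length rule on both sides lets the remaining terms match.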
To Reproduce
Steps to reproduce the behavior:
Create a document with "1st go live" in the text.
Search for "1st go live".
Expected behavior
The page should be found, either by matching "1st" and "live" or by matching all 3 words: "1st", "go", and "live".
Your project
Proposed fix https://github.com/uktrade/sphinx/tree/stemmer-len3
Screenshots
N/A
Environment info
Additional context
N/A