match `/` character #381

pierrevdb · 2022-01-31T08:34:52Z

When searching for a term that contains a '/', the text before and after the '/' is matched separately, but the term as a whole is not.

I couldn't find any documentation relating to this specific issue.

Steps to reproduce

1. Add a document that contains an attribute with a '/', for example {"ref": "ABC/12345"}
2. Perform a search using the API
3. The resulting _matchesInfo is [{"ref"=>[{"start"=>0, "length"=>3}, {"start"=>4, "length"=>5}]}]

Using the built-in web interface produces the same result with the text before and after the '/' highlighted.
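To make the reported offsets concrete, here is a small Python sketch that maps the `start`/`length` pairs from the result above back onto the stored value. It assumes plain ASCII text, so byte offsets and character offsets coincide; the helper names are made up for this illustration and are not part of any Meilisearch client.

```python
# Value and match offsets taken from the report above.
ref = "ABC/12345"
matches = [{"start": 0, "length": 3}, {"start": 4, "length": 5}]


def matched_spans(text, matches):
    """Return the substring covered by each (start, length) match."""
    return [text[m["start"]:m["start"] + m["length"]] for m in matches]


def uncovered_indices(text, matches):
    """Return the positions in text that no match covers."""
    covered = set()
    for m in matches:
        covered.update(range(m["start"], m["start"] + m["length"]))
    return [i for i in range(len(text)) if i not in covered]


print(matched_spans(ref, matches))      # ['ABC', '12345']
print(uncovered_indices(ref, matches))  # [3], the position of '/'
```

The only position left out of the highlights is index 3, the '/' itself, which is why the term is never matched as a whole.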

Expected behaviour
The search should match the entire term.

Meilisearch version: [e.g. v0.23.1]

Kerollmops (Maintainer) · 2022-01-31T09:53:41Z

Hey @pierrevdb,

Meilisearch generates tokens from the text by splitting it on a list of separators, which you can find on our documentation page. This list of separators is currently not customizable, but you can upvote this feature on our roadmap page; it will help us prioritize it.
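As a rough illustration of separator-based splitting, the Python sketch below reproduces the offsets from the report. It is not Meilisearch's actual tokenizer: the `SEPARATORS` list is a shortened, hypothetical one chosen for this example, and the `tokenize` helper is invented here.

```python
import re

# Hypothetical, shortened separator list for illustration only; the real
# list is the one published in the Meilisearch documentation.
SEPARATORS = [" ", "/", "-", "."]


def tokenize(text, separators=SEPARATORS):
    """Split text on separators and return (token, start, length) tuples."""
    # Build a character class matching runs of non-separator characters.
    pattern = "[^" + re.escape("".join(separators)) + "]+"
    return [(m.group(), m.start(), len(m.group()))
            for m in re.finditer(pattern, text)]


print(tokenize("ABC/12345"))
# [('ABC', 0, 3), ('12345', 4, 5)]

# With '/' taken out of the separator list, the reference stays in one piece.
print(tokenize("ABC/12345", [" ", "-", "."]))
# [('ABC/12345', 0, 9)]
```

The first call yields the same two spans as the `_matchesInfo` in the original report; the second shows the effect a customizable separator list would have.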

curquiza (Maintainer) · 2022-01-31T10:45:03Z

Hello @pierrevdb
I moved your issue to the right repo to be sure the missing feature you need is taken into consideration :)

pierrevdb (Author) · 2022-01-31T11:22:09Z

Thanks guys. Not sure how I missed that page of the documentation.

For our application we need to search many reference fields that contain document numbers. These almost always contain characters that are listed as soft/hard spaces, so being able to work around this would be advantageous. Simply removing those characters before indexing could also cause collisions that would make the search results ambiguous or less accurate, so that is not really an option.
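The collision concern can be sketched in a few lines of Python. The reference numbers and the `strip_separators` helper below are made up for illustration; they only show how distinct references can collapse into one value once the separator characters are removed.

```python
def strip_separators(ref, separators="/-. "):
    """Remove every separator character from a reference string."""
    return "".join(ch for ch in ref if ch not in separators)


# Made-up reference numbers that differ only in their separators.
refs = ["ABC/12345", "AB/C12345", "ABC-12345"]

# Group the original references by their stripped form.
normalized = {}
for r in refs:
    normalized.setdefault(strip_separators(r), []).append(r)

print(normalized)
# {'ABC12345': ['ABC/12345', 'AB/C12345', 'ABC-12345']}
```

All three references end up under the same key, so a search on the stripped value could no longer tell them apart.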

I will see whether giving proximity a higher priority has the desired effect for us, and will look at the roadmap as @Kerollmops suggested for the relevant features around customising/ignoring tokens.

macraig (Maintainer) · 2023-08-02T12:58:13Z

Hello everyone 👋

We just released a 🧪 prototype that allows customizing tokenization and we'd love your feedback.

How to get the prototype?

With Docker, run the following command:

docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-tokenizer-customization-2

From source, compile Meilisearch on the prototype-tokenizer-customization-2 tag

How to use the prototype?

You can find all the details in the PR.

⚠️ We do NOT recommend using this prototype in production. This is for test purposes only.

Feedback and bug reporting when using this prototype are encouraged! Thanks in advance for your involvement. It means a lot to us ❤️

macraig (Maintainer) · 2023-08-29T07:46:29Z

Hello everyone 👋

We have just released the first RC (release candidate) of Meilisearch containing this new feature!

You can test it by using:

- The release assets
- The Meilisearch Docker image

docker run -it --rm -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:v1.4.0-rc.0

You are welcome to leave your feedback in this discussion.

If you encounter any bugs, please report them here.
Thanks in advance for your help and your involvement in Meilisearch ❤️

🎉 The official stable release containing this change will be available on September 25th, 2023

⚠️ RC (release candidates) are not recommended for production

macraig (Maintainer) · 2023-09-26T08:07:35Z

Hey folks 👋

v1.4.0 has been released! 🦓 You can now customize tokenization by adding or removing tokens from the list of separator and non-separator tokens. ✨

Note:

📚 Separator tokens
📚 Non-separator tokens
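
For the case that started this thread, a minimal Python sketch of declaring '/' as a non-separator token might look like the following. Several things here are assumptions rather than facts from this discussion: the `settings/non-separator-tokens` route reflects my understanding of the v1.4 settings API and should be checked against the documentation pages above, the index name `documents` is hypothetical, and the instance is assumed to be listening on the local port used in the Docker commands earlier. The sketch only builds the request; sending it is left commented out.

```python
import json
import urllib.request


def build_non_separator_request(base_url, index_uid, tokens, api_key=None):
    """Build (but do not send) a PUT request that sets non-separator tokens."""
    # Assumed route; verify it against the non-separator tokens documentation.
    url = f"{base_url}/indexes/{index_uid}/settings/non-separator-tokens"
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps(tokens).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# "documents" is a hypothetical index name used only for this example.
req = build_non_separator_request("http://localhost:7700", "documents", ["/"])
print(req.get_method(), req.full_url)

# To send it against a running instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Once '/' is no longer treated as a separator, a value such as "ABC/12345" should be indexed as a single token, which is the behaviour requested in the original post.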