Use the new ElasticSearch endpoint to provide as you type brand suggestions #619

Closed
1 task
Tracked by #218 ...
teolemon opened this issue Nov 10, 2022 · 13 comments

Comments

@teolemon
Member

teolemon commented Nov 10, 2022

What

Related web feature

Part of

@monsieurtanuki
Contributor

@alexgarel No language or country parameters?

@monsieurtanuki
Contributor

Besides, looking for "pizza" I've found a lot of duplicates, which does not make that much sense for autocomplete:

[
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza Margherita"
  },
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza"
  },
  {
    "product_name": "Pizza"
  }
]
curl -X 'POST' \
  'https://search.openfoodfacts.net/autocomplete' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "response_fields": [
    "product_name"
  ],
  "num_results": 10,
  "text": "pizza",
  "search_fields": [
    "product_name",
    "brands",
    "categories"
  ]
}'
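
Until the endpoint offers de-duplication itself, a client could collapse repeated names before showing them. A minimal sketch, assuming the response is the list of objects shown above; the helper name `dedupe_suggestions` is made up for illustration and is not part of any Open Food Facts package:

```python
def dedupe_suggestions(results, field="product_name"):
    """Return the distinct values of `field`, keeping first-seen order.

    Hypothetical client-side helper: comparison ignores case and
    surrounding whitespace, and entries without the field are skipped.
    """
    seen = set()
    unique = []
    for item in results:
        value = item.get(field)
        if not value:
            continue
        cleaned = value.strip()
        key = cleaned.casefold()
        if key in seen:
            continue
        seen.add(key)
        unique.append(cleaned)
    return unique


sample = [
    {"product_name": "Pizza"},
    {"product_name": "Pizza"},
    {"product_name": "Pizza Margherita"},
    {"product_name": "Pizza"},
]
print(dedupe_suggestions(sample))  # ['Pizza', 'Pizza Margherita']
```

This only hides the symptom: with `num_results` set to 10 and nine identical names, the user would still see just two suggestions, so a server-side option is probably the better fix.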

@simonj222

Hey, I created that endpoint with the API in mind, which didn't have a strict ranking requirement. I should have some free time next week, will try porting the ranking logic.

We also have all of the data needed for a language/country search, so I can make sure that's exposed.

@monsieurtanuki
Contributor

Hi @simonj222!

My Xmas list:

  • language and country parameters
  • de-duplication of results (as a parameter?)
  • ranking parameter (best popularity? most recent? number of duplicated occurrences?)

No rush, just ping me when you're done, and thank you!

@simonj222

Quick update - was hoping to spend time on it this week, but the family got covid and then I'm back at work the week after, so probably won't have time to do this - sorry :(

It's worth saying that the current index should be able to serve the following features:

  • API responses
  • Our current search (but not with type-ahead)

This is what it was designed for (particularly the API responses, via a new version of the API to preserve backwards compatibility with the existing API).

If someone wants to take on these additional features, it shouldn't be too tricky:

language and country parameters

This is exposed in several places in the product: https://github.com/openfoodfacts/openfoodfacts-search/blob/main/app/models/product.py - you might have some luck with the StringFilter already.
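
As a rough illustration of what such a filter could look like at the Elasticsearch level, here is a sketch that builds a plain query-DSL body. It does not use the project's `StringFilter`, and the indexed field names `lang` and `countries_tags` are assumptions that would need checking against `product.py`:

```python
def filtered_autocomplete_query(text, search_fields, lang=None, country=None):
    """Build an Elasticsearch bool query restricted by language and country.

    Hypothetical helper. The "lang" and "countries_tags" field names are
    assumptions about the index mapping, not confirmed by the thread.
    """
    filters = []
    if lang:
        filters.append({"term": {"lang": lang}})
    if country:
        filters.append({"term": {"countries_tags": country}})
    return {
        "query": {
            "bool": {
                # Text matching contributes to the score...
                "must": [
                    {"multi_match": {"query": text, "fields": list(search_fields)}}
                ],
                # ...while filters only include or exclude documents.
                "filter": filters,
            }
        }
    }


body = filtered_autocomplete_query(
    "pizza", ["product_name", "brands"], lang="fr", country="en:france"
)
```

Putting the restrictions in `filter` rather than `must` keeps them out of the relevance score, which seems to be what a language/country parameter should do.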

de-duplication of results (as a parameter?)

I think we need to decide on the auto-complete/type-ahead behavior (I currently don't see any type-ahead behavior on https://world.openfoodfacts.org/). Do we want it to:

  • Intelligently suggest words and then display the search result site
  • Intelligently suggest actual results, and then link directly to the product page

I suspect we want the latter, in which case I don't think we want de-duplication, but want very aggressive ranking. If we want the former (not actual products), then we probably need to re-think things, but I also think it's less helpful.

ranking parameter (best popularity? most recent? number of duplicated occurrences?)

I think this is key - probably worth starting off with most popular by views (iirc that's what the current search does, but worth verifying). This is non-trivial since we'd need to update the view counts (this isn't currently stored). So, we'd probably want to:

  • Add this field
  • Index this field
  • Modify our indexing logic to index everything daily or weekly (currently it's just when something is modified).

From here, using it in ranking should be reasonably doable, with some manual tuning to find the right function.
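
To make that last step concrete, here is a sketch of one possible starting function, using Elasticsearch's `function_score` query with a `field_value_factor`. The field name `view_count` is hypothetical (the field does not exist yet, as noted above), and the `log1p` modifier with `sum` is only a first guess to be tuned:

```python
def popularity_ranked_query(text, search_fields, popularity_field="view_count"):
    """Wrap a text query so that popular products rank higher.

    Hypothetical helper: `popularity_field` is an assumed name for the
    view-count field that would first have to be added and indexed.
    """
    return {
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {"query": text, "fields": list(search_fields)}
                },
                "field_value_factor": {
                    "field": popularity_field,
                    # log1p dampens the effect of very large view counts.
                    "modifier": "log1p",
                    # Products without a view count are treated as zero views.
                    "missing": 0,
                },
                # Add the popularity term to the text score, so products
                # with zero views keep their text relevance.
                "boost_mode": "sum",
            }
        }
    }


body = popularity_ranked_query("pizza", ["product_name", "brands", "categories"])
```

With `boost_mode` set to `multiply` instead, products with no recorded views would score zero, which is likely too aggressive while view counts are still sparse.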

@monsieurtanuki
Contributor

Thank you @simonj222 for your verbose feedback. I hope everything is now OK for your family, covid-wise.

@teolemon
Member Author

teolemon commented Jun 2, 2023

@simonj222 @monsieurtanuki It would be really nice to have this one up, in our quest to simplify adding new products.

@teolemon
Member Author

  • @monsieurtanuki @simonj222 I thought we could just retrieve all values for brands, which shouldn't yield problematic duplicates.
  • The endpoint seems to be down. Do you have some context, @alexgarel?

@alexgarel
Member

@teolemon I'm not sure it's a good idea to restore the service now, as we expect it to change quite a lot in the coming months.

@teolemon
Member Author

teolemon commented Nov 9, 2023

We now have an autocomplete feature deployed in staging, thanks to @Frank Baele!
You can give it a try here: https://search.openfoodfacts.net/autocomplete?q=la+fabr&taxonomy_names=brand&lang=en (brand autocomplete).
It works for most existing taxonomies. For brands, we use the "pseudo-taxonomy" file, which only works with lang "en"; for the other taxonomies you can provide the lang you want.
Please note that we don't want to switch for anything but brands at the moment, since there are some regressions in the handling of accents.
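
For anyone wiring this into a client, here is a small sketch of how the request URL above could be assembled. The function name `autocomplete_url` and the `host` parameter are made up for illustration; only the path and the three query parameters come from the example link, and the response format is not covered here:

```python
from urllib.parse import urlencode

# Staging host taken from the example link above.
STAGING_HOST = "search.openfoodfacts.net"


def autocomplete_url(query, taxonomy_names="brand", lang="en", host=STAGING_HOST):
    """Build the GET URL for the autocomplete endpoint.

    Hypothetical helper: `host` is a parameter so that the same code can
    point at another deployment without further changes.
    """
    params = {"q": query, "taxonomy_names": taxonomy_names, "lang": lang}
    return f"https://{host}/autocomplete?{urlencode(params)}"


print(autocomplete_url("la fabr"))
# https://search.openfoodfacts.net/autocomplete?q=la+fabr&taxonomy_names=brand&lang=en
```

`urlencode` takes care of escaping, so partially typed input with spaces or accents can be passed in as-is.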

cc @raphael0202 @monsieurtanuki @g123k

@g123k
Contributor

g123k commented Nov 13, 2023

To what extent should we implement this in the Dart package?
Is it too early?

@teolemon
Member Author

Neither the syntax nor the output will change. The only change is .net becoming .org once it's deployed to production.
cc @raphael0202

@monsieurtanuki
Contributor

Closed by #835.
