Add quality facets based on the opposite: property, and on mutually-exclusive tag values #10372

Closed
Tracked by #10272
teolemon opened this issue May 30, 2024 · 2 comments · Fixed by #10378
Labels
  • 🧽 Data quality - Measure - Quality facets (One of the facets available in Open Food Facts is /quality & allows us to spot products w/ bad data)
  • 🧽 Data quality - Prevention
  • 🧽 Data quality (https://wiki.openfoodfacts.org/Quality)
  • opposites - logic bombs
  • 🎯 P1

Comments

@teolemon
Member

teolemon commented May 30, 2024

What

  • Add quality facets based on the opposite: property, and on mutually-exclusive tag values
  • I have started listing intersections that should be empty or emptier (category X category, category X label, label X origin…): https://wiki.openfoodfacts.org/Data_quality_missions#Intersections_to_check_regularly
  • We (mostly @aleene and @stephanegigandet) have added opposite: properties to the taxonomies in the past, but never leveraged them
  • We could start generating additional quality facets based on those. That would be helpful for the Real-time Data Quality API, to prevent such logic bombs before they happen
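
As a rough illustration of the idea in the list above, here is a minimal Python sketch of how opposite: properties could be turned into quality warnings. Everything in it is an assumption made for illustration: the dictionary-based taxonomy, the `opposite` key, the function names, and the `mutually-exclusive-tags:` warning format are all hypothetical and are not taken from the Open Food Facts codebase.

```python
# Hypothetical, simplified taxonomy: tag id -> properties.
# The "opposite" key stands in for the taxonomy's opposite: property;
# the real storage format may differ.
LABELS_TAXONOMY = {
    "en:vegan": {"opposite": "en:non-vegan"},
    "en:non-vegan": {"opposite": "en:vegan"},
    "en:organic": {},
}


def opposite_pairs(taxonomy):
    """Return the set of unordered tag pairs declared as opposites."""
    pairs = set()
    for tag, props in taxonomy.items():
        other = props.get("opposite")
        # Skip entries without an opposite, and ignore self-references.
        if other and other != tag:
            pairs.add(frozenset((tag, other)))
    return pairs


def quality_warnings(product_tags, exclusive_pairs):
    """Return one (hypothetical) warning tag per exclusive pair found on a product."""
    present = set(product_tags)
    warnings = []
    for pair in exclusive_pairs:
        # A pair is violated when both of its tags are on the product.
        if pair <= present:
            first, second = sorted(pair)
            warnings.append(f"mutually-exclusive-tags:{first}+{second}")
    return sorted(warnings)


pairs = opposite_pairs(LABELS_TAXONOMY)
print(quality_warnings(["en:vegan", "en:non-vegan", "en:organic"], pairs))
```

Storing each pair as a `frozenset` means a pair declared from both sides (vegan ↔ non-vegan) yields a single warning. The same `quality_warnings` function could also take a hand-maintained list of pairs, such as the category X label intersections, without going through the taxonomy.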
@benbenben2
Collaborator

Interesting. Would those be warnings or errors? I guess we can start with warnings, no?

Looking at the taxonomy, there are indeed such properties in labels.
They are used in Import.pm.

@teolemon
Member Author

teolemon commented Jun 1, 2024

Yes, warnings are good enough. Ideally, we could almost autofix some of those based on other fields, such as ingredients.
Some brands belong to specific countries: I just spotted a French retailer brand (Auchan) listed as sold in the US.
Hopefully, we'll add more constraints to various taxonomies and spot more of those. But I'm very taken with the notion of a "suggested fix".

@CharlesNepote CharlesNepote added the 🧽 Data quality https://wiki.openfoodfacts.org/Quality label Jun 3, 2024