Price/proof addition : location field : restrict OSM types #5568

Closed
raphodn opened this issue Sep 5, 2024 · 14 comments · Fixed by #5587

@raphodn
Member

raphodn commented Sep 5, 2024

Some users are adding prices and linking them to unwanted locations such as cities, countries, roads...

In the Open Prices web frontend we remove some OSM POI results using a blacklist, to avoid having proofs/prices linked to a non-shop location.

The list can be seen here : https://github.com/openfoodfacts/open-prices-frontend/blob/master/src/constants.js > NOMINATIM_RESULT_TYPE_EXCLUDE_LIST
Linked frontend issue : openfoodfacts/open-prices-frontend#37

If needed, we could move this blacklist to the backend and expose it via an API endpoint?

teolemon added the Prices label Sep 8, 2024
@monsieurtanuki
Contributor

@raphodn I've just run a default search: https://photon.komoot.io/api?q=berlin&bbox=9.5,51.5,11.5,53.5
In that case, which "type" are you referring to, "type": "street" or "osm_key": "highway"?
Neither value is part of your exclusion list...

{
    "geometry": {
        "coordinates": [
            9.7517684,
            52.3738781
        ],
        "type": "Point"
    },
    "type": "Feature",
    "properties": {
        "osm_id": 185830087,
        "extent": [
            9.7517684,
            52.3738781,
            9.7517784,
            52.3734873
        ],
        "country": "Allemagne",
        "city": "Hanovre",
        "countrycode": "DE",
        "postcode": "30159",
        "locality": "Bult",
        "county": "Région de Hanovre",
        "type": "street",
        "osm_type": "W",
        "osm_key": "highway",
        "district": "Ville-Sud-Bult",
        "osm_value": "secondary",
        "name": "Berliner Allee",
        "state": "Basse-Saxe"
    }
}

Btw, wouldn't an inclusion list or a search filter be more relevant? For example: https://photon.komoot.io/api?q=berlin&bbox=9.5,51.5,11.5,53.5&osm_tag=shop

@raphodn
Member Author

raphodn commented Sep 8, 2024

The exclusion list looks at :

  • Photon : properties.osm_value (secondary in your example)
  • Nominatim : type

And what is stored (and displayed) afterwards per location :

  • Photon : properties.osm_key:properties.osm_value
  • Nominatim : class:type
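The two lists above can be sketched as a pair of small helpers. This is only an illustration: the function names are made up for the example, and `EXCLUDE_LIST` holds placeholder entries, not the actual contents of `NOMINATIM_RESULT_TYPE_EXCLUDE_LIST`.

```python
from typing import Optional

# Placeholder entries for illustration only; the real list lives in the
# frontend's constants.js and is not reproduced here.
EXCLUDE_LIST = {"city", "country"}


def checked_value(provider: str, result: dict) -> Optional[str]:
    """Return the value that the exclusion list is compared against."""
    if provider == "photon":
        return result.get("properties", {}).get("osm_value")
    if provider == "nominatim":
        return result.get("type")
    raise ValueError(f"unknown provider: {provider}")


def stored_tag(provider: str, result: dict) -> str:
    """Return the 'key:value' string stored (and displayed) per location."""
    if provider == "photon":
        props = result.get("properties", {})
        return f"{props.get('osm_key', '')}:{props.get('osm_value', '')}"
    if provider == "nominatim":
        return f"{result.get('class', '')}:{result.get('type', '')}"
    raise ValueError(f"unknown provider: {provider}")


def is_excluded(provider: str, result: dict) -> bool:
    """True when the result's checked value is in the exclusion list."""
    return checked_value(provider, result) in EXCLUDE_LIST
```

For the Berliner Allee result quoted earlier, `stored_tag("photon", ...)` gives `highway:secondary`, and the value checked against the list is `secondary`.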

Bonus : the current top 30 location types in OP
https://github.com/openfoodfacts/open-prices/wiki/Stats#top-location-osm-types

wouldn't an inclusion list or a search filter be more relevant

There are prices everywhere: not only in supermarkets, but also in pharmacies, restaurants, bakeries, bookstores... Is it possible to have dozens of types in the inclusion list URL?

@monsieurtanuki
Contributor

There are prices everywhere: not only in supermarkets, but also in pharmacies, restaurants, bakeries, bookstores... Is it possible to have dozens of types in the inclusion list URL?

Looks so: https://photon.komoot.io/api?q=berlin&bbox=9.5,51.5,11.5,53.5&osm_tag=amenity:pharmacy&osm_tag=shop&limit=100
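As a minimal sketch, a URL like this one can be built by repeating the `osm_tag` parameter once per tag. The helper name `photon_url` and its signature are made up for the example; only the endpoint and the query parameters come from the URLs in this thread.

```python
from urllib.parse import urlencode

PHOTON_API = "https://photon.komoot.io/api"


def photon_url(query, osm_tags, bbox=None, limit=None):
    """Build a Photon search URL with one osm_tag parameter per tag."""
    params = [("q", query)]
    if bbox is not None:
        # bbox is passed as a single comma-separated value
        params.append(("bbox", ",".join(str(v) for v in bbox)))
    # A list of pairs lets the same key appear several times
    params.extend(("osm_tag", tag) for tag in osm_tags)
    if limit is not None:
        params.append(("limit", str(limit)))
    # Keep ',' and ':' unescaped so the URL stays readable
    return f"{PHOTON_API}?{urlencode(params, safe=',:')}"


url = photon_url(
    "berlin",
    ["amenity:pharmacy", "shop"],
    bbox=(9.5, 51.5, 11.5, 53.5),
    limit=100,
)
```

With these arguments the helper reproduces the URL above.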

@monsieurtanuki
Contributor

@raphodn According to your stats, I guess we would be ok in a first approach filtering on amenity and shop, right?
https://photon.komoot.io/api?q=carrefour,%20paris&osm_tag=amenity&osm_tag=shop&limit=100

@raphodn
Member Author

raphodn commented Sep 9, 2024

The stats are just for info: there are 900+ locations, so only a subset shows up in the top 30 types. At no moment did I say we should restrict... but they DO show that places like house or city have been used as locations.

Recently I've been adding prices in restaurants, greengrocers, DIY shops, bars, pharmacies... so I'm in favor of giving the user as many choices as we can.

So probably sticking with a blacklist rather than a long whitelist ("dozens" of entries, as stated above).

@monsieurtanuki
Contributor

at no moment did I say we should restrict

You meant a blacklist that doesn't restrict?

Besides

@raphodn
Member Author

raphodn commented Sep 9, 2024

Oook, my bad: I re-read the whole thread and now understand what you mean. Filtering only on the OSM key (via Photon's osm_tag parameter) is much less restrictive than filtering on osm_value. I just need to test a bit, but your Photon URL looks good 💯

the full list of location types : https://gist.github.com/raphodn/7c53c4a3403f0f86e89f09e9e7a7ddaf

@monsieurtanuki
Contributor

@raphodn Of course, for a better analysis we should look at how many prices are linked to "right" or "wrong" locations, but from what I could see in your list, whitelisting shop and amenity looks reasonable.

shop
shop:supermarket:621
shop:convenience:77
shop:chemist:13
shop:variety_store:13
shop:bakery:7
shop:mall:7
shop:deli:7
shop:frozen_food:6
shop:greengrocer:5
shop:furniture:5
shop:department_store:4
shop:farm:4
shop:wholesale:4
shop:books:3
shop:sports:3
shop:doityourself:3
shop:newsagent:2
shop:gift:2
shop:beauty:1
shop:garden_centre:1
shop:computer:1
shop:kiosk:1
shop:cheese:1
shop:dairy:1
shop:electronics:1
shop:car_repair:1
shop:hardware:1
shop:clothes:1
shop:ticket:1
shop:interior_decoration:1
shop:travel_agency:1
shop:pasta:1
shop:toys:1
shop:health_food:1
shop:general:1
shop:outpost:1
amenity
amenity:fuel:18
amenity:pharmacy:6
amenity:fast_food:3
amenity:cafe:3
amenity:bar:2
amenity:university:2
amenity:restaurant:2
amenity:post_office:2
amenity:charging_station:1
amenity:place_of_worship:1
amenity:bicycle_rental:1
amenity:parking:1
amenity:bus_station:1
amenity:food_court:1
amenity:community_centre:1
amenity:ice_cream:1
amenity:casino:1
amenity:veterinary:1
no shop and no amenity
::11
boundary:administrative:25
boundary:local_authority:2
boundary:political:2
building:commercial:2
building:retail:2
building:supermarket:1
building:yes:4
highway:bus_stop:7
highway:motorway_junction:1
highway:pedestrian:1
highway:primary:1
highway:residential:2
highway:secondary:2
historic:fort:1
landuse:cemetery:1
landuse:commercial:2
landuse:construction:5
landuse:farmyard:1
landuse:greenfield:1
landuse:industrial:4
landuse:residential:1
landuse:retail:6
leisure:stadium:1
man_made:bridge:1
man_made:street_cabinet:1
man_made:surveillance:2
natural:peak:1
office:company:2
place:city:3
place:hamlet:1
place:house:7
place:plot:1
place:suburb:3
place:town:2
railway:station:6
railway:yard:1
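
A rough way to check how much of such a list a whitelist would cover is to sum the counts per OSM key. The sketch below assumes the `key:value:count` line format used above; the function name and the five sample lines (copied from the list) are only there for illustration.

```python
from collections import Counter


def counts_by_key(lines):
    """Sum location counts per OSM key from 'key:value:count' lines."""
    totals = Counter()
    for line in lines:
        # Split from the right so an empty key and value ('::11') still parse
        key, _value, count = line.rsplit(":", 2)
        totals[key] += int(count)
    return totals


sample = [
    "shop:supermarket:621",
    "shop:convenience:77",
    "amenity:fuel:18",
    "place:house:7",
    "::11",
]
totals = counts_by_key(sample)
whitelisted = sum(totals[key] for key in ("shop", "amenity"))
share = whitelisted / sum(totals.values())
```

Running it on the full list instead of the sample gives the share of locations that a shop and amenity whitelist would keep.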

@raphodn
Member Author

raphodn commented Sep 9, 2024

ok following this discussion I opened a PR in the web frontend.

I had a look at the building entries, for instance: https://www.openstreetmap.org/way/174737917
In the case of a supermarket or equivalent they should be tagged as shop, so the problem is on the OSM side. But being too restrictive might discourage some contributors (as opposed to allowing more POIs and fixing the data afterwards).

Do you think we should add building to the whitelist? Or just keep it to shop & amenity for now?

@monsieurtanuki
Contributor

The question is how to deal with crap OSM data, both before and after we put it in Prices.

As a user, I find it very painful to find "carrefour" shops in Paris because both "carrefour" and "Paris" have different meanings:

Therefore I would really appreciate being able to enter just "carrefour paris" without ambiguity when talking about shops.

That said, "your" LIDL cannot be found as a shop.
We could introduce an optional "advanced" search mode, without shop/amenity filter, for obvious cases where OSM data is slightly flawed?

In parallel, we may enhance the whitelist.

@raphodn
Member Author

raphodn commented Sep 9, 2024

After a few hours of thought (and some previous discussions on the subject), I would go with:

  • restrict OSM search results to shop and amenity (and maybe put some info bubble for transparency)
  • if no results, allow the user to input a simple string (describing where they are / shop / city) (feature in the backend needed)

@monsieurtanuki
Contributor

I may have an even better solution:

  • query with shop and amenity filter
  • query without filters
  • results to the user: filtered then non filtered

The 2 API calls are transparent to the user, who sees a single sorted list.
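
A minimal sketch of this two-query idea, assuming a hypothetical `search(query, osm_tags)` callable that wraps the Photon call and returns result features shaped like the JSON quoted earlier (with `properties.osm_type` and `properties.osm_id`). Neither the callable nor `combined_search` exists in the project; they are names made up for the example.

```python
def combined_search(search, query, osm_tags=("shop", "amenity")):
    """Return filtered results first, then unfiltered ones, without duplicates."""
    filtered = list(search(query, osm_tags))  # first call: shop/amenity only
    unfiltered = list(search(query, None))    # second call: no filter

    seen = set()
    merged = []
    for result in filtered + unfiltered:
        props = result["properties"]
        # An OSM object is identified by its type (N/W/R) and its id
        ident = (props["osm_type"], props["osm_id"])
        if ident not in seen:
            seen.add(ident)
            merged.append(result)
    return merged
```

Deduplicating on `(osm_type, osm_id)` keeps a shop from showing up twice when both queries return it.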

@raphodn
Member Author

raphodn commented Sep 10, 2024

So you would combine the 2 result sets? It might "bloat" the results.

But the idea of opening up the search is good. It could be a user action:

  • default search restricts to shop and amenity (and maybe put some info bubble for transparency)
  • with a small link-button saying "not found ? wider search" that does a second search
  • if still no results, allow the user to input a simple string (describing where they are / shop / city) (feature in the backend needed)
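
The three steps above could be sketched as a single function that reports which stage the UI is in. Everything here is hypothetical: `search(query, osm_tags)` stands for whatever wraps the geocoder call, and the stage names are made up for the example.

```python
def location_search_step(search, query, widen=False):
    """Return (stage, results) for the staged location search."""
    if not widen:
        # Default: restrict to shop and amenity
        return "filtered", list(search(query, ("shop", "amenity")))

    # User clicked "not found? wider search": second search, no filter
    results = list(search(query, None))
    if results:
        return "wide", results

    # Still nothing: the UI should offer a free-text location input
    return "free_text", []
```

The wider search only runs when the user asks for it, so the default list is not bloated by unfiltered results.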

@monsieurtanuki
Contributor

@raphodn This is what I have in mind:

  • a more relevant search (only shop/amenity)
  • if the user doesn't find the location, an additional search with no filter

With this we provide the user with a better UX (e.g. "carrefour paris" really delivers shops), while tolerating slightly crappy OSM data.

What do you think of that?

[Screenshots: "relevant search first" (Screenshot_1726077190) and "unfiltered results if needed" (Screenshot_1726077197)]


3 participants