taxonomy: fix taxonomies with duplicate lines for the same language #9902

Merged: 2 commits, Mar 12, 2024
17 changes: 15 additions & 2 deletions scripts/taxonomies/sort_each_taxonomy_entry.pl
@@ -147,11 +147,24 @@ ($$)
     # synonym
     elsif ($line =~ /^(\w+):[^:]*(,.*)*$/) {
         if (!defined $entry_id_line) {
-            $entry_id_line = {line => $line, previous => [@previous_lines]};
+            $entry_id_line = {line => $line, previous => [@previous_lines], lc => $1};
         }
         else {
             my $lc = $1;
-            $entries{$lc} = {line => $line, previous => [@previous_lines]};
+            if ((defined $entries{$lc}) || ($entry_id_line->{lc} eq $lc)) {
+                # emit a warning as this seems like a strange case
+                print STDERR "Warning: duplicate synonym for $lc, on entry line $line_num\n";
+                print STDERR "- " . ($entries{$lc}{line} // $entry_id_line->{line});
+                print STDERR "- " . $line;
+            }
+            # but try to do our best and continue
+            if (defined $entries{$lc}) {
+                $entries{$lc}{line} = $entries{$lc}{line} . $line;
+                push @{$entries{$lc}{previous}}, @previous_lines;
+            }
+            else {
+                $entries{$lc} = {line => $line, previous => [@previous_lines]};
+            }
         }
         @previous_lines = ();
     }
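The behaviour of the patched branch can be illustrated with a minimal Python sketch (not the script's Perl). It rests on several assumptions: each line keeps its trailing newline, every non-synonym line is simply treated as a "previous" line, and the function name `collect_synonym_lines` is hypothetical. As in the patch, a repeated language code produces a warning and the new line is appended to the stored one rather than overwriting it.

```python
import re

# Same pattern as the Perl branch: a language code, a colon, then synonyms.
SYNONYM_RE = re.compile(r"^(\w+):[^:]*(,.*)*$")


def collect_synonym_lines(lines):
    """Group the synonym lines of one entry by language code.

    Returns (entry_id_line, entries, warnings). This is a simplified model:
    the real script has other branches (parents, properties, comments).
    """
    entry_id_line = None
    entries = {}
    warnings = []
    previous = []
    for line_num, line in enumerate(lines, start=1):
        match = SYNONYM_RE.match(line)
        if not match:
            # Assumption: anything else is kept as a "previous" line.
            previous.append(line)
            continue
        lc = match.group(1)
        if entry_id_line is None:
            # First synonym line is the entry id; remember its language.
            entry_id_line = {"line": line, "previous": previous, "lc": lc}
        else:
            if lc in entries or entry_id_line["lc"] == lc:
                warnings.append(
                    f"duplicate synonym for {lc}, on entry line {line_num}"
                )
            if lc in entries:
                # Keep both lines instead of silently dropping the first.
                entries[lc]["line"] += line
                entries[lc]["previous"].extend(previous)
            else:
                entries[lc] = {"line": line, "previous": previous}
        previous = []
    return entry_id_line, entries, warnings
```

Before the change, the second line for a language replaced the first in `entries`, so synonyms could be lost without any message.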
7 changes: 2 additions & 5 deletions taxonomies/additives.txt
@@ -17753,7 +17753,7 @@ vegetarian:en:yes

#comment:en:E553 is a natural OR synthetic form of magnesium silicate
en:E553, Magnesium silicates, magnesium silicate
fr:E553, Silicates de magnésium
fr:E553, Silicate de magnésium, Silicates de magnésium
bg:E553, Магнезиев силикат
cs:E553, Křemičitan hořečnatý
da:E553, Magnesiumsilicat
@@ -17762,7 +17762,6 @@ el:E553, Πυριτικο μαγνησιο
es:E553, Silicato magnésico
et:E553, Magneesiumsilikaat
fi:E553, Magnesiumsilikaatti, Magnesiumsilikaattia
fr:E553, Silicate de magnésium
hu:E553, Magnézium-szilikát, Magnézium-szilikátok
it:E553, Silicato di magnesio
lt:E553, Magnio silikatas
@@ -17896,7 +17895,6 @@ lv:E553b, Talks
mt:E553b, Terra
ml:E553b, ടാൽക്
ms:E553b, Talkum
mt:E553b, Terra
nb:E553b, talk
nl:E553b, Talk, Steatiet, Speksteen, Talkpoeder
oc:E553b, Talc
@@ -19067,8 +19065,7 @@ sh:E631, Dinatrijum inozinat
sk:E631, Inozínan disodný, Inozínan sodný
sl:E631, Dinatrijev inozinat, natrijev inozinat
sr:E631, dinatrijum-inozitat
sv:E631, Dinatriuminosinat, Natriuminosinat
sv:E631, dinatriuminosinat, E631 dinatriuminosinat
sv:E631, E631 dinatriuminosinat, Dinatriuminosinat, Natriuminosinat
tr:E631, disodyum inosinat
vi:E631, Natri-II inosinat
xx:E631
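The data fixes in this file fold two lines for the same language into one. A minimal Python sketch of such a merge follows; the helper name `merge_synonym_lines` is hypothetical, synonyms are assumed to contain no commas of their own, and duplicates are dropped case-insensitively. It keeps first-seen order, so its output can differ in order from the hand-edited lines in this diff.

```python
def merge_synonym_lines(first, second):
    """Merge two taxonomy synonym lines that share a language code."""
    lc1, _, rest1 = first.partition(":")
    lc2, _, rest2 = second.partition(":")
    if lc1 != lc2:
        raise ValueError(f"language codes differ: {lc1!r} vs {lc2!r}")
    merged = []
    seen = set()
    for synonym in (rest1 + "," + rest2).split(","):
        synonym = synonym.strip()
        key = synonym.casefold()  # compare without regard to case
        if synonym and key not in seen:
            seen.add(key)
            merged.append(synonym)
    return f"{lc1}:" + ", ".join(merged)


print(merge_synonym_lines(
    "sv:E631, Dinatriuminosinat, Natriuminosinat",
    "sv:E631, dinatriuminosinat, E631 dinatriuminosinat",
))
# prints: sv:E631, Dinatriuminosinat, Natriuminosinat, E631 dinatriuminosinat
```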
9 changes: 1 additions & 8 deletions taxonomies/data_quality.txt
@@ -687,10 +687,6 @@ es:Valor nutricional superior a 105 - Sal
de:Nährwert über 105 - Salz
it:Valore nutrizionale superiore a 105 - Sale
description:en:Salt value is over 105g per 100g, which is impossible.
fr:Valeur nutritionnelle supérieure à 105 - Sel
es:Valor nutricional superior a 105 - Sal
de:Nährwert über 105 - Salz
it:Valore nutrizionale superiore a 105 - Sale

<en:Nutrition errors
en:Nutrition value over 105 - Salt prepared
@@ -699,10 +695,6 @@ es:Valor nutricional superior a 105 - Sal preparada
de:Nährwert über 105 - Salz zubereitet
it:Valore nutrizionale superiore a 105 - Sale preparate
description:en:Salt prepared value is over 105g per 100g, which is impossible.
fr:Valeur nutritionnelle supérieure à 105 - Sel préparé
es:Valor nutricional superior a 105 - Sal preparada
de:Nährwert über 105 - Salz zubereitet
it:Valore nutrizionale superiore a 105 - Sale preparate

<en:Nutrition errors
en:Nutrition value over 105 - Saturated fat
@@ -1338,6 +1330,7 @@ description:en:There are 7 different languages in one ingredient list. For insta
description:fr:Il y a 7 langues différentes dans une liste d'ingrédients. Par exemple, la liste d'ingrédients en français ne devrait inclure que la langue déclarée, et les autres langues devraient être déplacées vers une liste d'ingrédients pour l'allemand, l'italien...
description:es:Hay 7 idiomas diferentes en una lista de ingredientes. Por ejemplo, la lista de ingredientes en francés solo debería incluir el idioma declarado, y los otros idiomas deberían moverse a una lista de ingredientes para alemán, italiano...
description:de:In einer Zutatenliste gibt es 7 verschiedene Sprachen. Zum Beispiel sollte die Zutatenliste auf Französisch nur die angegebene Sprache enthalten, und andere Sprachen sollten in eine Zutatenliste für Deutsch, Italienisch usw. verschoben werden...

<en:Ingredients warnings
en:Ingredients - Number of languages - above 1
#description:en:
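Duplicated translation blocks like the ones removed above can be found mechanically. The following Python sketch is one possible check, not part of this PR: it assumes entries are separated by blank lines (which is why a missing blank line between two entries would show up as a false duplicate), reuses the synonym pattern from the Perl script, and the function name `find_duplicate_languages` is hypothetical.

```python
import re
from collections import Counter

# Matches synonym lines such as "fr:Madeleines, Madeleine ordinaire";
# parent lines ("<en:...") and property lines ("wikidata:en:...") do not match.
SYNONYM_RE = re.compile(r"^(\w+):[^:]*(,.*)*$")


def find_duplicate_languages(text):
    """Return (entry_number, language_code, count) for repeated language lines."""
    duplicates = []
    blocks = re.split(r"\n\s*\n", text.strip())
    for entry_number, block in enumerate(blocks, start=1):
        counts = Counter()
        for line in block.splitlines():
            match = SYNONYM_RE.match(line)
            if match:
                counts[match.group(1)] += 1
        duplicates.extend(
            (entry_number, lc, n) for lc, n in counts.items() if n > 1
        )
    return duplicates
```

Run on the pre-fix "Wild caviars" entry from this diff, it would report `en` appearing twice in that entry.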
69 changes: 16 additions & 53 deletions taxonomies/food/categories.txt
@@ -1484,7 +1484,6 @@ cs:Mirin
da:Mirin
de:Mirin, Reiswein
el:Μιρίν
en:Mirin
eo:Mirino
es:Mirin
fa:میرین, چاشنی
@@ -3758,20 +3757,19 @@ origins:en: en:Slovenia
#wikidata:en:

<en:Fruit brandy
pl:Palinca
en:Pálinka
pl:Palinca, Pálinka
be:Палінка
bg:Палинка
ca:Aiguardent de fruita
cv:Палинка
de:Pálinka
en:Pálinka
fi:Pálinka
fr:Pálinka
hu:Pálinka
it:Pálinka
lt:Palinka
nl:Palinka
pl:Pálinka
pt:Palinca
ro:Palincă
ru:Палинка
@@ -9556,10 +9554,6 @@ es:Café de Laos
it:Caffè del Laos
nl:Koffie uit Laos
pt:Café do Laos
es:Café de Laos
it:Caffè del Laos
nl:Koffie uit Laos
pt:Café do Laos
origins:en: en:Laos

<en:Coffees
@@ -9568,11 +9562,6 @@ fr:Café du Nicaragua
es:Café de Nicaragua
it:Caffè del Nicaragua
nl:Koffie uit Nicaragua
pt:Café da Nicarágua
es:Café de Nicaragua
it:Caffè del Nicaragua
nl:Koffie uit Nicaragua
pt:Café da Nicarágua
origins:en: en:Nicaragua

<en:Coffees
@@ -10756,10 +10745,6 @@ nl:Kookwijnen
es:Vinos de cocina
de:Kochweine
it:Vini da cucina
fr:Vins de cuisine
nl:Kookwijnen
es:Vinos de cocina
de:Kochweine
gpc_category_code:en:10000052
gpc_category_name:en:Cooking Wines
gpc_category_description:en:Definition: Includes any products that can be described/observed as a typically inferior variety of wine to Drinking Wine, sometimes adulterated with salt, which is used in cooking to enhance the flavour or colour of a prepared recipe. These products are specifically labelled and marketed as cooking wine, and their alcohol content ensures that they do not need to be refrigerated.Definition Excludes: Excludes products such as Wine or Sherry Alcoholic Beverages, and any alcohol not specifically used for cooking.
@@ -10815,7 +10800,7 @@ da:Franske mousserende vine
de:Französische Schaumweine
es:Vinos espumosos franceses
fi:ranskalaiset kuohuviinit
fr:Vins effervescents français
fr:Vins pétillants français, Vins effervescents français
he:יינות קידוח צרפתיים
hr:Francuska pjenušava vina
hu:Francia pezsgők
@@ -10830,14 +10815,7 @@ sv:Franska mousserande viner
th:ไวน์อัดแก๊สฝรั่งเศส
tr:Fransız şampanyaları
zh:法国起泡酒
fr:Vins pétillants français
es:Vinos espumosos franceses
de:Französische Schaumweine
it:Spumanti francesi
ja:フランス産スパークリングワイン
nl:Franse mousserende wijnen
ru:Французские игристые вина
zh:法国起泡酒

<en:Wines
en:Red wines, red wine
@@ -10883,8 +10861,6 @@ de:10% Rotwein
nl:10% rode wijn
ru:10% красное вино
zh:10% 红葡萄酒
es:Vino tinto al 10%
de:10% Rotwein
agribalyse_food_code:en:5203
ciqual_food_code:en:5203
ciqual_food_name:en:Wine, red, 10°
@@ -10896,9 +10872,9 @@ fr:Vin rouge à 11°, Vin rouge à 11 degrés
bg:11% червено вино
ca:Vi negre 11%
de:11% Rotwein
es:Vino tinto 11%
es:Vino tinto 11%, Vino tinto al 11%
fi:11% punaviini
it:Vino rosso 11%
it:Vino rosso 11%, Vino rosso al 11%
ja:11% 赤ワイン
lt:11% raudonas vynas
nl:11% rode wijn
@@ -10907,9 +10883,6 @@ pt:Vinho tinto 11%
ru:11% красное вино
tr:11% kırmızı şarap
zh:11% 红葡萄酒
es:Vino tinto al 11%
de:11% Rotwein
it:Vino rosso al 11%
ciqual_food_code:en:5204
ciqual_food_name:en:Wine, red, 11°
ciqual_food_name:fr:Vin rouge 11°
@@ -14070,9 +14043,11 @@ fr:Chassagne-Montrachet premier cru Bois de Chassagne, Chassagne-Montrachet Bois
origins:en: en:France, fr:Bourgogne
protected_name_type:en: pdo
#wikidata:en:

<fr:Chassagne-Montrachet
fr:Chassagne-Montrachet premier cru Bois de Chassagne rouge
wikidata:en:Q42866482

<fr:Chassagne-Montrachet
fr:Chassagne-Montrachet premier cru Bois de Chassagne blanc
wikidata:en:Q42866481
@@ -36516,10 +36491,9 @@ ciqual_food_name:fr:Cigarette
#ciqual_food_name:fr:Biscuit sec type langue de chat ou cigarette russe

<en:Cakes
en:Madeleines, Madeleines cakes
en:Madeleines, Madeleines cakes, Madeleine biscuit
ca:Magdalenes
de:Madeleine
en:Madeleine biscuit
es:Magdalenas
fi:Madeleine-leivokset
fr:Madeleines, Madeleine ordinaire
@@ -39925,7 +39899,6 @@ de:Kaviar vom Acipenser gueldenstaedtii
<en:Caviars
en:Wild caviars
de:Wildkaviar
en:Wild caviars
it:Caviali selvatici
nl:Wilde kaviaar

@@ -41452,7 +41425,7 @@ es:Espeltas, Escaña mayor, Escanda mayor
et:speltanisu, spelta
eu:espelta
fa:اسپلت
fi:Speltit, Spelttijauho
fi:Speltit, Spelttijauho, spelttivehnä, speltti, spelttiä
fr:Épeautres, Épeautre, Grands épeautres, Grand épeautre
ga:speilt
gl:espelta
@@ -41503,7 +41476,6 @@ agribalyse_food_code:en:9001
ciqual_food_code:en:9001
ciqual_food_name:en:Spelt, raw
ciqual_food_name:fr:Épeautre, cru
fi:spelttivehnä, speltti, spelttiä


<en:Cereal grains
@@ -43324,6 +43296,7 @@ hu:Pasztőrözött camembert
it:Camembert pastorizzato
nl:Gepasteuriseerde camemberts
wikidata:en:Q47472053

<en:Camemberts
en:Camembert with Calvados
de:Camembert mit Calvados
@@ -51259,8 +51232,7 @@ ciqual_food_name:fr:Gâteau mousse de fruits sur génoise, type miroir, bavarois


<en:Desserts
en:Liégeois
en:chocolate custard topped with whipped cream refrigerated, coffee custard topped with whipped cream refrigerated, caramel custard topped with whipped cream refrigerated, vanilla custard topped with whipped cream refrigerated
en:Liégeois, chocolate custard topped with whipped cream refrigerated, coffee custard topped with whipped cream refrigerated, caramel custard topped with whipped cream refrigerated, vanilla custard topped with whipped cream refrigerated
fr:Liégeois
nl:Toetjes met roomlaag
agribalyse_food_code:en:19681
@@ -55259,7 +55231,7 @@ ciqual_food_name:en:Pork fat, raw
ciqual_food_name:fr:Lard gras, cru

<en:Animal fats
nl:Goose fats
en:Goose fats
de:Gänsefette
fr:Graisse d'oie, gras d'oie, graisse d'oies, gras d'oies
hu:Liba zsírok
@@ -61203,7 +61175,7 @@ ciqual_food_name:en:Gelatine, dried
ciqual_food_name:fr:Gélatine, sèche

<en:Thickeners
nl:Plantbased gelatin
en:Plantbased gelatin
fr:Gélifiants végétaux, gélifiant végétal
lt:Augalinė želatina
nl:Plantaardige gelatines
@@ -80521,7 +80493,7 @@ ciqual_food_name:en:Veal escalope cordon bleu (topped with a ham slice and Gruye

<en:Cordons bleus
<en:Chicken preparations
fr:Chicken cordons bleus
en:Chicken cordons bleus
bg:Пилешки кордон бльо
fr:Cordons bleus de poulet
nl:Kip cordon bleus
@@ -90397,7 +90369,6 @@ fr:Sandwichs à la volaille
nl:Gevogelte sandwiches, pluimvee sandwiches
es:Bocadillos de aves, Sándwiches de aves
it:Panini di carne avicola, Panini al pollame
pt:Sanduíches de aves
bg:Сандвичи с птици
pt:Sanduíches de aves

@@ -95296,7 +95267,7 @@ gpc_category_name:en:Jams/Marmalades/Fruit Spreads (Perishable)
gpc_category_description:en:Definition: Includes any products that can be described/observed as a sweet semi-firm liquid, usually used as a spread, made by cooking and blending crushed fruit in sugar, and allowing the mixture to set, often with the addition of setting agents. Includes jams, jellies, and marmalades. These products must be refrigerated to extend their consumable life. Definition Excludes: Excludes products such as Confectionery Based Spreads and Honey, and Jams/Marmalades (Shelf Stable).

<en:Fruit and vegetable preserves
fr:Fruit preserves with chocolate
en:Fruit preserves with chocolate
fr:Confitures au chocolat

#uncooked fruit preserve
@@ -96000,15 +95971,12 @@ hr:Marmelada
hu:Marmelád
it:Marmellate, marmellata
ja:マーマレード
it:Marmellate
nl:Marmelades
pt:Marmeladas
lv:Marmelāde
lt:Marmeladas
mt:marmellata
nl:Marmelades, Marmelade
pl:Marmolada
pt:Marmeladas
ro:Marmelada
sk:Marmeláda
sl:Marmelada
@@ -96024,7 +95992,6 @@ el:μαρμελάδα εσπεριδοειδών
es:Mermeladas y confituras de cítricos
et:Tsitrusmarmelaad
fi:sitrushillot
fi:sitrushillot
fr:Confitures d’agrumes, Confiture d’agrumes
hu:Citrus lekvárok
it:Confetture di agrumi
@@ -97449,7 +97416,6 @@ ciqual_food_name:fr:Fructose
en:Glucose
es:Glucosa
bg:Глюкоза
es:Glucosa
fr:Glucose
ja:グルコース
nl:Glucose
@@ -99234,7 +99200,6 @@ ciqual_food_name:fr:Chou vert, cuit
<en:Cabbages
en:Red cabbage
bg:Червено зеле
bg:Червено зеле
de:Rotkraut
es:Lombardas, Lombarda, Col lombarda
fi:punakaali
@@ -100611,7 +100576,6 @@ en:Canned sweet red peppers
bg:Червени чушки от консерва
fr:Poivrons rouges en conserve, Poivrons rouges appertisés
es:Pimientos rojos en conserva
fr:Poivrons rouges en conserve
nl:Rode paprikas in blik/pot
agribalyse_food_code:en:20275
ciqual_food_code:en:20275
@@ -108394,10 +108358,9 @@ nl:Taartdegen, taartdeeg
intake24_category_code:en:PSDO

<en:Pie dough
en:Shortcrust pastry
en:Shortcrust pastry, Baked shortcrust pastry
fr:Pâtes brisées, Pâte brisée, Pâtes à tarte brisées
ja:パート・ブリゼ
en:Baked shortcrust pastry
nl:Zanddegen
nl_be:Zanddeeg
agribalyse_food_code:en:23410