-
-
Notifications
You must be signed in to change notification settings - Fork 279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor the use of database #55
Comments
There are 2 distinct uses:
|
Regarding the database, the technical document that @PrimaelQuemerais has written for the app has a large section on why we need one and what we want to do with it: https://docs.google.com/document/d/1dOryz657AaAC7feaVMWvzdxB8RQ1Ze8O7PEk1QiJUww/edit#heading=h.u53wgh648h3k |
Disclaimer: I'm a former dev Oracle DBA, and I spent years creating, loading and querying relational databases with optimization in mind. It doesn't mean I know everything and I'm always right, it means I can be trusted regarding relational database (e.g. SQLite). I don't know that much about NoSQL. A typical use of the full product database is to get the product attached to a barcode (or products attached to barcodes). It's a normal use of a relational database, whose core is the "primary key" (a unique and non null value), and which is optimized for that (through the transparent use of "unique index"). The good thing about Sembast is that it supports An additional good thing about Sembast is that, as it stores JSON, we may be able to run queries like that: await store.find(database,
finder: Finder(
filter: Filter.and([
Filter.matches('product_name', 'Wasa'),
Filter.greaterThan('ecoscore', 'b'),
]))); We couldn't do that easily in SQLite, as we would have to scan the whole product database, decode Let's talk about the second use case: history, favorites and custom lists of products. Both Sembast and SQLite (and even It was not my initial choice, but from what I read about Sembast I think it would be perfect for both use cases. Later, we may find issues with the performances (too much memory involved, too slow queries, too slow app startup), but for the moment I think we can stay with one database system, named Sembast. |
Thank you for the detailed review. The JSON format and the query system were the reasons why we preferred Sembast over something like Hive which performances were superior but without any custom query possible. The main advantage of storing the data in a JSON format is the ability to use the classes and functions from openfoodfacts-dart, querying products from the local database is done in the exact same way as querying products from the API. |
Regarding the order of lists: I don't really see the use case for letting users re-order manually the lists in a custom order (as in dragging and dropping a list item at a specific place). But there is a big use case to let them display their lists in different orders (e.g. by date of adding the product to the list, by number of days before the products will perish, by number of instances of the product they have consumed / they currently own, by their food preferences, or something else etc.). One use case to think of: for a given product, we need to know in which lists it belongs to (if any). e.g. you scan a product, and the app tells you it's already in the list "Products in my fridge", or that you scanned it one month ago. Those use cases can be implemented in many different ways of course. |
@stephanegigandet OK, we won't reorder the lists. I agree with you: those use cases can be implemented with different way of thinking or database system. @PrimaelQuemerais Performances are already flagged as problematic with Sembast, then. My main concern with Sembast is this: "The whole document based database resides in a single file and is loaded in memory when opened." (found here). That means the bigger the database, the slower the database opening and the higher the level of memory used. It stinks. With SQLite, the startup is "almost" immediate : "Whenever a database is opened, the entire schema is scanned and parsed and a parse tree for the schema is held in memory. That means that database connection startup time and initial memory usage is proportional to the size of the schema." (the schema meaning the metadata about the tables) (found here). That means the database size has no impact on the duration of the database opening and the amount of memory used. From what I read so far, smoothie's only concern is "I give you a barcode, you give me the JSON". Not something more elaborated like I'm ready to run a test between Sembast and SQLite:
Do you agree with that test, guys? |
@monsieurtanuki Just curious, with SQLite, do we have to pre-declare the maximum size of the JSON when the table is created (e.g. something like VARCHAR(4096), or is that handled automagically? One thing that would be very useful to test as well: what if we have 10k products, 100k products, or even 1M products. e.g. for offline use, we could do something similar to OsmAnd where you can download maps for offline use for the countries you want. We could for instance have downlable archives of the top 10k or 100k most popular products. Or as you suggested, more specific datasets like all vegetarian products or all products sold in a specific store. |
@stephanegigandet SQLite does not care about variable types, actually. Could be a number, could be a string (but no specified size). And there's no limit - actually there's one limit: 1 billion bytes. We'll have a problem with VERY fat foods ;) There shouldn't be a problem with the number of products in SQLite (our main concern will be with the primary key index's size). PS: funny you mention OsmAnd as I contribute to osmdroid. |
Guys, we're lucky. I've got a 6-year-old canary-in-the-coalmine smartphone. My first test crashed with Sembast: I couldn't reopen my 8K product database. While the SQLite was pretty cool. I'll give you more details tomorrow, but to me it's a NO GO for Sembast. |
Once the functions to store and retrieve a Product from the database are written, Sembast offers little to no advantage over SQLite, except that each time we update the Product class the functions need to be updated aswell, which wasn't the case with Sembast. The benchmark seems a little bit unfair to me as Sembast would allow to query elements by looking at nested fields in the product, which would not be possible if the data is stored in SQLite in a single field. That being said the chose comes down to which features we actually need. Waiting for the benchmark details :) |
Thanks for all the tests!
I wonder if those 12 seconds are mostly due to the json.encode. What if we want to insert 10k products (from a downloadable file with the most popular products for instance), where each product data is available as a json string. Could you check how long it would take? |
@stephanegigandet I relaunched the same SQLite test: this time loading 1K product took around 8s. And when I used I think the database insert performances are not as important as the database opening:
For the record, what a product look like, data-wise (10Kb when you remove the spaces){
"product_name": "Salmón ahumado",
"lang": "en",
"brands": "Ocean Sea",
"serving_size": "50g",
"nutrition_grade_fr": "d",
"code": "20034658",
"selected_images": {
"front": {
"thumb": {
"fr": "https://static.openfoodfacts.org/images/products/20034658/front_fr.98.100.jpg"
},
"display": {
"fr": "https://static.openfoodfacts.org/images/products/20034658/front_fr.98.400.jpg"
},
"small": {
"fr": "https://static.openfoodfacts.org/images/products/20034658/front_fr.98.200.jpg"
}
},
"ingredients": {
"small": {
"es": "https://static.openfoodfacts.org/images/products/20034658/ingredients_es.78.200.jpg",
"fr": "https://static.openfoodfacts.org/images/products/20034658/ingredients_fr.83.200.jpg"
},
"thumb": {
"es": "https://static.openfoodfacts.org/images/products/20034658/ingredients_es.78.100.jpg",
"fr": "https://static.openfoodfacts.org/images/products/20034658/ingredients_fr.83.100.jpg"
},
"display": {
"es": "https://static.openfoodfacts.org/images/products/20034658/ingredients_es.78.400.jpg",
"fr": "https://static.openfoodfacts.org/images/products/20034658/ingredients_fr.83.400.jpg"
}
},
"nutrition": {
"display": {
"fr": "https://static.openfoodfacts.org/images/products/20034658/nutrition_fr.93.400.jpg"
},
"thumb": {
"fr": "https://static.openfoodfacts.org/images/products/20034658/nutrition_fr.93.100.jpg"
},
"small": {
"fr": "https://static.openfoodfacts.org/images/products/20034658/nutrition_fr.93.200.jpg"
}
}
},
"nutriments": {
"fat_100g": 10,
"sodium": 1.28,
"energy_100g": 728,
"saturated-fat_100g": 1.5,
"nova-group_100g": 3,
"sugars_unit": "g",
"sugars": 0,
"fiber_serving": 0,
"fat_unit": "g",
"fat": 10,
"proteins": 21,
"sodium_unit": "g",
"proteins_value": 21,
"salt_serving": 1.6,
"saturated-fat_value": 1.5,
"energy_unit": "kcal",
"nutrition-score-fr_100g": 13,
"nutrition-score-fr": 13,
"energy-kcal": 174,
"energy_serving": 364,
"energy": 728,
"nova-group": 3,
"energy-kcal_value": 174,
"fiber_unit": "g",
"sodium_100g": 1.28,
"carbohydrates_value": 0,
"saturated-fat_serving": 0.75,
"sodium_value": 1.28,
"energy-kcal_100g": 174,
"salt_100g": 3.2,
"sugars_100g": 0,
"nova-group_serving": 3,
"proteins_serving": 10.5,
"salt_unit": "g",
"fat_serving": 5,
"energy_value": 174,
"fiber": 0,
"fiber_value": 0,
"fruits-vegetables-nuts-estimate-from-ingredients_100g": 0,
"salt": 3.2,
"carbohydrates_100g": 0,
"saturated-fat_unit": "g",
"salt_value": 3.2,
"carbon-footprint-from-known-ingredients_serving": 321,
"saturated-fat": 1.5,
"proteins_unit": "g",
"carbon-footprint-from-known-ingredients_product": 1280,
"energy-kcal_serving": 87,
"carbohydrates_serving": 0,
"fiber_100g": 0,
"carbohydrates_unit": "g",
"energy-kcal_unit": "kcal",
"proteins_100g": 21,
"fat_value": 10,
"carbohydrates": 0,
"sugars_serving": 0,
"sugars_value": 0,
"sodium_serving": 0.64
},
"quantity": "200gr",
"attribute_groups": [{
"attributes": [{
"id": "nutriscore",
"title": "Nutri-Score D",
"name": "Nutri-Score",
"description_short": "Poor nutritional quality",
"match": 32.5,
"icon_url": "https://static.openfoodfacts.org/images/misc/nutriscore-d.svg",
"description": "",
"status": "known"
}, {
"icon_url": "https://static.openfoodfacts.org/images/icons/nutrient-level-salt-high.svg",
"match": 0,
"description_short": "3.2 g / 100 g",
"status": "known",
"id": "low_salt",
"name": "Salt",
"title": "Salt in high quantity"
}, {
"name": "Fat",
"title": "Fat in moderate quantity",
"id": "low_fat",
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/nutrient-level-fat-medium.svg",
"description_short": "10 g / 100 g",
"match": 55.2941176470588
}, {
"title": "Sugars in low quantity",
"name": "Sugars",
"id": "low_sugars",
"status": "known",
"match": 100,
"description_short": "0 g / 100 g",
"icon_url": "https://static.openfoodfacts.org/images/icons/nutrient-level-sugars-low.svg"
}, {
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/nutrient-level-saturated-fat-medium.svg",
"match": 80,
"description_short": "1.5 g / 100 g",
"title": "Saturated fat in moderate quantity",
"name": "Saturated fat",
"id": "low_saturated_fat"
}],
"id": "nutritional_quality",
"name": "Nutritional quality"
}, {
"id": "allergens",
"attributes": [{
"title": "Does not contain: Milk",
"name": "Milk",
"id": "allergens_no_milk",
"debug": "3 ingredients (0 unknown)",
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-milk.svg",
"match": 100
}, {
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-gluten.svg",
"match": 100,
"name": "Gluten",
"title": "Does not contain: Gluten",
"id": "allergens_no_gluten",
"debug": "3 ingredients (0 unknown)"
}, {
"status": "known",
"match": 100,
"icon_url": "https://static.openfoodfacts.org/images/icons/no-celery.svg",
"title": "Does not contain: Celery",
"name": "Celery",
"id": "allergens_no_celery",
"debug": "3 ingredients (0 unknown)"
}, {
"title": "Does not contain: Soybeans",
"name": "Soybeans",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_soybeans",
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-soybeans.svg",
"match": 100
}, {
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-mustard.svg",
"match": 100,
"name": "Mustard",
"title": "Does not contain: Mustard",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_mustard"
}, {
"icon_url": "https://static.openfoodfacts.org/images/icons/no-none.svg",
"match": 100,
"status": "known",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_none",
"title": "Does not contain: None",
"name": "None"
}, {
"name": "Fish",
"title": "Does not contain: Fish",
"id": "allergens_no_fish",
"debug": "3 ingredients (0 unknown)",
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-fish.svg",
"match": 100
}, {
"name": "Molluscs",
"title": "Does not contain: Molluscs",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_molluscs",
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-molluscs.svg",
"match": 100
}, {
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_crustaceans",
"title": "Does not contain: Crustaceans",
"name": "Crustaceans",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-crustaceans.svg",
"match": 100,
"status": "known"
}, {
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_sesame_seeds",
"title": "Does not contain: Sesame seeds",
"name": "Sesame seeds",
"match": 100,
"icon_url": "https://static.openfoodfacts.org/images/icons/no-sesame-seeds.svg",
"status": "known"
}, {
"match": 100,
"icon_url": "https://static.openfoodfacts.org/images/icons/no-peanuts.svg",
"status": "known",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_peanuts",
"title": "Does not contain: Peanuts",
"name": "Peanuts"
}, {
"status": "known",
"icon_url": "https://static.openfoodfacts.org/images/icons/no-lupin.svg",
"match": 100,
"name": "Lupin",
"title": "Does not contain: Lupin",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_lupin"
}, {
"icon_url": "https://static.openfoodfacts.org/images/icons/no-sulphur-dioxide-and-sulphites.svg",
"match": 100,
"status": "known",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_sulphur_dioxide_and_sulphites",
"name": "Sulphur dioxide and sulphites",
"title": "Does not contain: Sulphur dioxide and sulphites"
}, {
"title": "Does not contain: Nuts",
"name": "Nuts",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_nuts",
"status": "known",
"match": 100,
"icon_url": "https://static.openfoodfacts.org/images/icons/no-nuts.svg"
}, {
"icon_url": "https://static.openfoodfacts.org/images/icons/no-eggs.svg",
"match": 100,
"status": "known",
"debug": "3 ingredients (0 unknown)",
"id": "allergens_no_eggs",
"title": "Does not contain: Eggs",
"name": "Eggs"
}],
"warning": "There is always a possibility that data about allergens may be missing, incomplete, incorrect or that the product's composition has changed. If you are allergic, always check the information on the actual product packaging.",
"name": "Allergens"
}, {
"name": "Ingredients",
"id": "ingredients_analysis",
"attributes": [{
"title": "Non-vegan",
"status": "known",
"name": "Vegan",
"match": 0,
"id": "vegan",
"icon_url": "https://static.openfoodfacts.org/images/icons/non-vegan.svg"
}, {
"match": 0,
"icon_url": "https://static.openfoodfacts.org/images/icons/non-vegetarian.svg",
"id": "vegetarian",
"title": "Non-vegetarian",
"status": "known",
"name": "Vegetarian"
}, {
"status": "known",
"title": "Palm oil free",
"id": "palm-oil-free",
"icon_url": "https://static.openfoodfacts.org/images/icons/palm-oil-free.svg",
"match": 100
}]
}, {
"name": "Food processing",
"id": "processing",
"attributes": [{
"match": 50,
"description_short": "Processed foods",
"icon_url": "https://static.openfoodfacts.org/images/misc/nova-group-3.svg",
"description": "",
"status": "known",
"id": "nova",
"title": "NOVA 3",
"name": "NOVA group"
}]
}, {
"name": "",
"id": "ingredients",
"attributes": [{
"title": "Without additives",
"status": "known",
"name": "Additives",
"match": 100,
"id": "additives",
"icon_url": "https://static.openfoodfacts.org/images/icons/0-additives.svg"
}]
}, {
"id": "environment",
"attributes": [{
"status": "known",
"match": 28.6396590441916,
"description_short": "High environmental impact",
"icon_url": "https://static.openfoodfacts.org/images/icons/ecoscore-d.svg",
"description": "",
"title": "Eco-Score D",
"name": "Eco-Score",
"id": "ecoscore"
}, {
"icon_url": "https://static.openfoodfacts.org/images/icons/forest-footprint-not-computed.svg",
"description": "",
"description_short": "Currently only for products with chicken or eggs",
"match": 0,
"status": "known",
"id": "forest_footprint",
"title": "Forest footprint not computed",
"name": "Forest footprint"
}],
"name": "Environment"
}, {
"name": "Labels",
"id": "labels",
"attributes": [{
"name": "Organic farming",
"title": "Not an organic product",
"id": "labels_organic",
"status": "known",
"description": "Organic farming aims to protect the environment and to conserve biodiversity by prohibiting or limiting the use of synthetic fertilizers, pesticides and food additives.",
"icon_url": "https://static.openfoodfacts.org/images/icons/not-organic.svg",
"description_short": "Organic products promote ecological sustainability and biodiversity.",
"match": 0
}, {
"match": 0,
"description_short": "Fair trade products help producers in developping countries.",
"description": "When you buy fair trade products, producers in developing countries are paid an higher and fairer price, which helps them improve and sustain higher social and often environmental standards.",
"icon_url": "https://static.openfoodfacts.org/images/icons/not-fair-trade.svg",
"status": "known",
"id": "labels_fair_trade",
"name": "Fair trade",
"title": "Not a fair trade product"
}]
}],
"categories_tags": ["en:seafood", "en:fishes", "en:salmons", "en:smoked-fishes", "en:smoked-salmons", "en:smoked-salmons-from-farming"],
"labels_tags": [],
"additives_tags": [],
"environment_impact_level_tags": [],
"product_quantity": "200",
"nutrient_levels": {
"fat": "moderate",
"salt": "high",
"sugars": "low",
"saturated-fat": "moderate"
},
"ingredients_analysis_tags": ["en:palm-oil-free", "en:non-vegan", "en:non-vegetarian"],
"image_small_url": "https://static.openfoodfacts.org/images/products/20034658/front.4.200.jpg"
} |
Thanks for all the tests. I agree SQLite seems to be the best option for our needs. |
@stephanegigandet Great! Let's start with SQLite then. I'll send a PR within a week. We won't be stuck with SQLite forever anyway: as far as most developers are concerned, they will call a black-box that returns a product stored locally, from its barcode. Regardless of the type of database behind. If for some very good reason we feel appropriate in the future to switch to another type of database, nothing should stop us and it will be transparent for the developers. More specifically about the new SQLite database, we'll create a "table" in it (that's the relational database terminology), that will contain all the (e.g. downloaded) products. As we already saw, the extraction of the JSON part is more or less painful.
|
At this point, I'm more thinking of the SQLite database as a big cache of product data, that we access only as a key (code) / value (json) cache. Whenever we need data for a specific product (that we scan, or that we have in a list like the history of scanned products), we look for the corresponding codes through the database (and through the API if we have some network). Those lists will probably be reasonably small (less than 1000 products), so if we want to filter them, we can do it outside the database. If we wanted to support a full offline search / exploration (like the current "category browser" that is in the app), then we could have separate fields for categories etc. but at this point this feature is very experimental and very likely to change. Let's do it through the OFF API first, and if there's really a need to make it available offline (which is less and less of a need as connectivity gets more ubiquitous), we can see that later. We could consider adding a last access time (so that we can expire entries that are never accessed), but then that would mean updating the entries whenever we access them. Not sure if it's worth it. A "source" field would be useful, so that we know if the entry was added through a scan, a search, a downloaded / initial database etc. Maybe also a field to indicate whether the product belongs to at least one list. |
Ok, I'll stay with my 3 columns then for SQLite table create table product(
barcode TEXT PRIMARY KEY,
encoded_json TEXT NOT NULL,
last_upsert INT NOT NULL
) And about additional fields like "source", it's not related to a product, it's related to a product list:
|
I think it is more logical to make a list out of the last one. |
…get's New files: * `barcode_product_query.dart`: API query / product by barcode (used to be in `full_products_database.dart`) * `local_database.dart`: local SQLite database with one table (`product`) Deleted files: * `choose_page_model.dart`: moved the code to `StatefulWidget` `ChoosePage` * `full_product_database.dart`: database code was deprecated; API fields is now in `ProductQuery` Impacted files: * `alternative_continuous_scan_page.dart`: now a `StatefulWidget`; refactored with `ContinuousScanPage` * `build.gradle`: unrelated - lowered the Android `minSdkVersion` from `24` to `19`, in order to run the app on my old smartphone ;) * `choose_page.dart`: refactoring as changed to `StatefulWidget` * `continuous_scan_model.dart`: not a `ChangeNotifier` anymore - should be improved when we use product list * `continuous_scan_page.dart`: now a `StatefulWidget` * `group_product_query.dart`: minor refactoring due to the changes in `product_query.dart` * `keywords_product_query.dart`: minor refactoring due to the changes in `product_query.dart` * `main.dart`: added `LocalDatabase` to the providers; simplified the code; removed mentions to `SharedPreferences` * `product_query.dart`: refactored a bit; moved here some fields from `full_product_database.dart` * `product_query_model.dart`: now using `LocalDatabase` * `product_query_page.dart`: now using `LocalDatabase`; refactoring * `pubspec.yaml`: replaced `sembast with `sqflite`
…el without Widgets Impacted files: * `alternative_continuous_scan_page.dart`: refactoring mainly due to changes in `ContinuousScanModel` * `continuous_scan_model.dart`: now we don't deal with `Widget`s, only metadata, and we use a NotifyListeners method * `continuous_scan_page.dart`: refactoring mainly due to changes in `ContinuousScanModel` * `contribution_page.dart`: unrelated refactoring about the use of `UserPreferences` * `local_database.dart`: added a temporary "dummy"`NotifyListeners` method, for the sake of demonstration - to be removed * `smooth_product_carousel.dart`: now we build the `Widget`s from metadata, and we control here the carousel
I don't know if this has already been said somewhere, but I wanted to mention that scanning no longer works for me in the current master.
|
This error comes from the openfoodfacts-dart plugin, will investigate |
@PrimaelQuemerais I think it's fixed in openfoodfacts/openfoodfacts-dart#65, where the |
Great! I will take a look at the PR. |
feature/#55 - SQLite local database, and refactoring with StatefulWidget's
For the moment, user preferences are stored in
SharedPreferences
, which is appropriate, and products are stored in a Sembast database in full_products_database.dart.I'm not 100% convinced by Sembast, my personal taste is rather SQLite (with a bit of JSON in it), but this can be challenged, of course.
My concern is rather what we do with the database.
For the moment, not that much: we populate the database when we download products, but we never query the database.
I think it would be interesting to actually use the database:
The text was updated successfully, but these errors were encountered: