Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add data quality warning for serving size without digit #8057

Conversation

benbenben2
Copy link
Collaborator

@benbenben2 benbenben2 commented Jan 31, 2023

Tested on Gitpod

Related to:

Questions (and answers):

  1. should it involve nutrition_data_prepared as well?
    -> decision: yes
  2. raise quality error tag - only when nutrition_data_per or nutrition_data_prepared_per are "on" (option 1) - vs - if string without digit value is in serving_size (even if nutrition_data_per or nutrition_data_prepared_per are missing) (option 2)
    -> decision: all (option 2) because as stated by @CharlesNepote "when users are often seeing this value they can think it's a good value" (Quality facet - Nutrition - serving_size="serving" for data-quality #5163)
  3. how come serving_quantity=0 when it is "serving" or any string?
    -> in lib/ProductOpener/Food.pm, the subroutine normalize_serving_size extracts any digit and return it or 0 if nothing is found.
    3.1. can this be used to setup the error?
    -> it cannot be serving_quantity (using normalize_serving_size) because serving_quantity is undefined if the value is something like "serving 1 and serving 2 and serving 3", which contain digit
  4. what name to give to the error?
    -> existing one is "en:serving-size-in-mg"
    -> plural was used for "EcoScore - threatened species - Ingredients missing"
    hence, "en:serving-size-is-missing-digits" seems to be a good choice

@benbenben2 benbenben2 added ✨ Feature Features or enhancements to Open Food Facts server 🧽 Data quality https://wiki.openfoodfacts.org/Quality 🧽 Data quality - Measure - Quality facets One of the facets available in Open Food Facts is /quality & allows us to spot products w/ bad data labels Jan 31, 2023
@benbenben2 benbenben2 requested a review from a team as a code owner January 31, 2023 21:40
@benbenben2 benbenben2 self-assigned this Jan 31, 2023
@codecov-commenter
Copy link

codecov-commenter commented Jan 31, 2023

Codecov Report

Merging #8057 (b4631e9) into main (a67350e) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #8057      +/-   ##
==========================================
+ Coverage   44.70%   44.72%   +0.02%     
==========================================
  Files         100      100              
  Lines       20212    20220       +8     
  Branches     4636     4637       +1     
==========================================
+ Hits         9035     9043       +8     
  Misses      10103    10103              
  Partials     1074     1074              
Impacted Files Coverage Δ
lib/ProductOpener/DataQualityFood.pm 56.65% <100.00%> (+0.24%) ⬆️
tests/unit/dataqualityfood.t 76.22% <100.00%> (+1.22%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@benbenben2
Copy link
Collaborator Author

Thank you @stephanegigandet
Your suggested code is working fine (tested with gitpod).
But for perltidy, no idea why it is failing (tried changing formatting, it is working with only 2 conditions but not with 3) :-(

@stephanegigandet
Copy link
Contributor

But for perltidy, no idea why it is failing (tried changing formatting, it is working with only 2 conditions but not with 3) :-(

Hi @benbenben2, you need to run "make lint", and perltidy will change the indentation, split long lines etc.

@sonarcloud
Copy link

sonarcloud bot commented Feb 3, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

Copy link
Contributor

@stephanegigandet stephanegigandet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@stephanegigandet stephanegigandet merged commit aa9404d into main Feb 6, 2023
@stephanegigandet stephanegigandet deleted the feat_add_data_quality_warning_serving_size_without_digit branch February 6, 2023 08:46
@CharlesNepote
Copy link
Member

Didn't have time to thank you, it's great!

CharlesNepote added a commit that referenced this pull request Feb 10, 2023
teolemon added a commit that referenced this pull request Feb 11, 2023
* Add en:Serving size is missing digits

See #5163 and #8057.

* ci: autolabel changes to the taxo

---------

Co-authored-by: Pierre Slamich <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🧽 Data quality - Measure - Quality facets One of the facets available in Open Food Facts is /quality & allows us to spot products w/ bad data 🧽 Data quality https://wiki.openfoodfacts.org/Quality ✨ Feature Features or enhancements to Open Food Facts server 🧪 tests
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants