NLP pipeline validation with CI tests #22

Merged 16 commits on Dec 7, 2023

fmigneault
Contributor

changes

  • add JSON typings
  • align JSON formatting across files
  • fix nlp imports
  • add GitHub CI tests workflow

@fmigneault fmigneault self-assigned this Dec 6, 2023
@fmigneault fmigneault marked this pull request as ready for review December 6, 2023 23:08
@fmigneault
Contributor Author

@TimeaBagosiCrim
Imports are now validated and working.
However, one test seems to fail:
tests/test_MetricsClasses.py::MetricsClassesTests::test_val - AssertionError: 1.0 != 0
https://github.com/crim-ca/pavics-jupyter-images/actions/runs/7121350381/job/19390424450?pr=22

Do you know what could be the cause?

@fmigneault
Contributor Author

@TimeaBagosiCrim

The error seems to be introduced by this step:

for value_type in VALUE_TYPES:
    value_measures.get_value_metrics(value_type).perfect_value_match = \
        value_measures.get_value_metrics(value_type).perfect_value_match \
        / value_measures.get_value_metrics(value_type).total_matching_attributes \
        if value_measures.get_value_metrics(value_type).total_matching_attributes > 0 else 0

Before running this loop, value_measures.get_value_metrics("numeric").perfect_value_match returns 1.0 as expected for the first annotation with "value":

{
    "text": "cloud cover lower than 10%",
    "position": [57, 83],
    "type": "property",
    "name": "cloud cover",
    "value": 10,
    "value_type": "percentage",
    "operation": "lt"
}

However, total_matching_attributes is always zero because the following check never succeeds (value holds an actual int, not a numeric str):

if 'value' in ann.keys() and isinstance(ann['value'], str):
    # print(ann['value'])
    if isnumeric(ann['value']):

This leads to perfect_value_match being reset to zero every time for "numeric", because the loop takes the else 0 branch whenever total_matching_attributes is zero.
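
To make the reset concrete, here is a minimal reproduction sketch. The `ValueMetrics` class and the `normalize` function are hypothetical stand-ins, not the project's actual classes; only the conditional expression mirrors the loop quoted above.

```python
from dataclasses import dataclass


@dataclass
class ValueMetrics:
    # Hypothetical stand-in for the per-type metrics object in the project.
    perfect_value_match: float = 0.0
    total_matching_attributes: int = 0


def normalize(metrics: ValueMetrics) -> None:
    # Same conditional expression as the loop body quoted above:
    # divide when the denominator is positive, otherwise fall back to 0.
    metrics.perfect_value_match = (
        metrics.perfect_value_match / metrics.total_matching_attributes
        if metrics.total_matching_attributes > 0
        else 0
    )


# A match was counted, but the denominator was never incremented.
numeric = ValueMetrics(perfect_value_match=1.0, total_matching_attributes=0)
normalize(numeric)
print(numeric.perfect_value_match)  # 0
```

With a zero denominator the counted match is discarded, which is consistent with the 1.0 versus 0 mismatch reported by test_val.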

Is it a problem to adjust the logic to allow int/float values as well, or will that break other code somewhere else that expects str only?
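
For illustration, a relaxed check could look like the sketch below. The helper name `is_numeric_value` is hypothetical, and parsing with `float()` is an assumed substitute for the project's `isnumeric` helper, whose exact behavior is not shown here. Whether other code relies on str-only values remains the open question.

```python
from numbers import Real


def is_numeric_value(value: object) -> bool:
    """Hypothetical helper: True for int/float values and for numeric strings."""
    if isinstance(value, bool):
        # bool is a subclass of int; treating it as non-numeric is an assumption.
        return False
    if isinstance(value, Real):
        return True
    if isinstance(value, str):
        # Assumed stand-in for the project's isnumeric(): accept anything float() parses.
        try:
            float(value)
        except ValueError:
            return False
        return True
    return False


ann = {"name": "cloud cover", "value": 10, "value_type": "percentage"}
if "value" in ann and is_numeric_value(ann["value"]):
    print("numeric value:", ann["value"])
```

Note that `float()` also accepts strings such as "nan" and "inf", so a stricter parser may be preferable if those should not count as numeric.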

@fmigneault fmigneault merged commit b190ca2 into DAC-524-baseline-V2 Dec 7, 2023
9 checks passed
@fmigneault fmigneault deleted the baseline-v2-tests branch December 7, 2023 17:48