Context

To create a cropland map:

- A model is trained on a labeled training dataset and evaluated on a labeled evaluation set.
- Then the trained model makes predictions across all the data for an area of interest (AOI).
Is the cropland map good?
- A model trained on training dataset A is presumed to perform well on evaluation dataset B if A and B have a similar distribution.
- A trained model that performs well on evaluation dataset B is presumed to make a good map for area of interest C if B and C have a similar distribution.
Issue 1: Understanding performance on the evaluation dataset
Currently we evaluate a trained model by measuring the f1-score over an evaluation dataset B.
The metric helps us understand how well the model predicts crops; however, it does not tell us much about what sort of errors the model may be making (for example, whether errors are concentrated in particular regions or land-cover types).
Issue 2: Performance on the evaluation dataset translating to map quality
The score on the evaluation dataset B only matters if the distribution of B is similar to that of the area of interest C.
Currently we:

- Create evaluation dataset B by randomly sampling inside the area of interest C
- Record the crop percentage in the sampled dataset B
- Record the labeler disagreement in the sampled dataset B
These points help shed some light on the similarity between B and C, and thereby on how well the metric translates to map quality. However, is it possible to have more confidence that a good metric translates to a high-quality map?
Potential Solution
We can use agro-ecological zones to 1) better understand performance on the evaluation dataset, and 2) better understand how that performance translates to map quality, by measuring model performance on each agro-ecological zone represented inside the evaluation dataset (see the per-zone metric sketch after this section).
From FAO:

> An Agro-ecological Zone is a land resource mapping unit, defined in terms of climate, landform and soils, and/or land cover, and having a specific range of potentials and constraints for land use.
Understanding performance on each zone will be especially relevant for areas of interest with many agro-ecological zones, such as Uganda (#254).
This additional understanding will help inform how we gather future evaluation data, corrective labeling, and disclaimers that we can add to published cropland maps.
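As a concrete illustration, here is a minimal sketch of per-zone evaluation. It assumes an evaluation DataFrame with one row per point and hypothetical "label", "prediction", and "agro-ecological" columns; these names are assumptions for illustration, not the repo's actual schema:

```python
import pandas as pd
from sklearn.metrics import f1_score

def per_zone_f1(eval_df: pd.DataFrame) -> pd.Series:
    """Compute one F1-score per agro-ecological zone.

    Assumes binary 0/1 crop labels in "label" and "prediction" columns,
    and a zone name per point in an "agro-ecological" column.
    """
    return eval_df.groupby("agro-ecological").apply(
        lambda zone_df: f1_score(zone_df["label"], zone_df["prediction"])
    )
```

Reporting this per-zone Series alongside the overall F1-score would immediately show whether a strong aggregate metric is hiding weak performance in particular zones.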
Potential implementation
1. Record the agro-ecological zone of each evaluation point when available.
This can be implemented by adding a dataset of agro-ecological zones for a particular region and using that dataset to determine the agro-ecological zone for every coordinate in a LabeledDataset, generating an additional agro-ecological column (crop-mask/datasets.py, line 65 at 0cf29ff):
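A minimal sketch of how this lookup could work, assuming the zones are available as polygons and the labeled data exposes lon/lat columns. The file path, the "zone" attribute, the lon/lat column names, and the helper name are all hypothetical, not the actual crop-mask API:

```python
import geopandas as gpd
import pandas as pd

def add_agro_ecological_column(
    df: pd.DataFrame, aez_path: str = "data/aez/aez_regions.shp"
) -> pd.DataFrame:
    """Attach an agro-ecological zone label to every labeled coordinate."""
    aez = gpd.read_file(aez_path)  # polygons with a "zone" attribute (assumed)
    points = gpd.GeoDataFrame(
        df,
        geometry=gpd.points_from_xy(df["lon"], df["lat"]),
        crs=aez.crs,  # assumes the labeled coordinates share the AEZ layer's CRS
    )
    # Spatial join: each point inherits the zone of the polygon containing it;
    # points outside every polygon get NaN (and overlapping polygons would
    # need deduplication, not handled in this sketch).
    joined = gpd.sjoin(
        points, aez[["zone", "geometry"]], how="left", predicate="within"
    )
    df["agro-ecological"] = joined["zone"]
    return df
```

If the AEZ product is distributed as a raster rather than vector polygons, the same column could instead be produced by sampling the raster at each coordinate (e.g., with rasterio's sample).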
2. Use the newly generated agro-ecological column to record the agro-ecological distribution for each dataset in data/reports.txt
This can be implemented by adding an additional line to compute the value_counts() for the agro-ecological column here (crop-mask/src/labeled_dataset_custom.py, line 105 at 0cf29ff):
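A minimal sketch of the addition, assuming the report-writing code holds the labeled data in a pandas DataFrame named df and accumulates text in a report string (both names are assumptions about labeled_dataset_custom.py, not verified against the repo):

```python
# Append the zone distribution to the dataset report; normalize=True gives
# fractions, which are easier to compare across datasets of different sizes.
zone_distribution = df["agro-ecological"].value_counts(normalize=True)
report += f"\nagro-ecological zone distribution:\n{zone_distribution.to_string()}\n"
```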
3. Log a new per-class agro-ecological accuracy to wandb as a confusion matrix to better understand how well each model is doing in each zone
This requires a little more nuance because pytorch-lightning takes responsibility for some of the metric recording; however, the relevant lines of code are here (crop-mask/src/pipeline_funcs.py, line 100 at 0cf29ff):
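A minimal sketch of the logging step, assuming the LightningModule accumulates predictions, labels, and each point's zone during the test loop. The class name, the test_labels / test_preds / test_zones attributes, and the binary class list are all assumptions, and the exact epoch-end hook name varies across pytorch-lightning versions:

```python
import wandb
from pytorch_lightning import LightningModule

class CropClassifier(LightningModule):
    # ... test steps are assumed to append one entry per evaluation point to
    # self.test_labels, self.test_preds, and self.test_zones (hypothetical).

    def on_test_epoch_end(self):
        # Log one wandb confusion matrix per agro-ecological zone so that
        # per-zone error patterns are visible next to the aggregate metrics.
        for zone in sorted(set(self.test_zones)):
            idx = [i for i, z in enumerate(self.test_zones) if z == zone]
            self.logger.experiment.log(
                {
                    f"confusion_matrix/{zone}": wandb.plot.confusion_matrix(
                        y_true=[self.test_labels[i] for i in idx],
                        preds=[self.test_preds[i] for i in idx],
                        class_names=["non-crop", "crop"],
                    )
                }
            )
```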
Sounds like a good study! Not sure if you already discussed who would work on this, but it could be a good-first-issue.
I think it would be good to circle back on the work that has been done on generating dataset reports and how we can make these easily accessible/usable (including the intercomparison reports).
Agreed, I think it would be helpful to narrow down the target audience for this. In my view, the purpose of this issue is to help us (cropland producers) make better decisions around gathering future evaluation data, corrective labeling, and disclaimers that we can add to published cropland maps. Therefore, the final deliverable of the potential solution is a wandb metric that will be associated with a model and accessible to us.
Who would you say is the target audience for dataset reports? @hannah-rae