[FIX] Fix label mismatch of evaluation and validation with large dataset in semantic segmentation #1851

supersoob · 2023-03-03T05:01:55Z

Issue
(1) 'otx eval' (inference step & final evaluation score after training) shows us very low score for models trained with dataset with multiple classes(>10)
(2) The order of classes written in validation log is wrong with dataset with multiple classes(>10)
(3) [TODO] Semantic segmentation always produces lower evaluation score compared to the best validation score -> (Found solution but need some discussion)

Root Causes
(1)(2) : When converting segmentation mask to otx 2d numpy array(mapping to class index), it str-sorts the labels with id key.

training_extensions/otx/api/utils/segmentation_utils.py

Line 70 in 6cba8c9

labels = sorted(labels) # type: ignore

Due to this, when num_classes is above 10, the order is newly made like this. That caused mismatch with unsorted label schema.json which is saved in order of initial dataset_meta.json. Plus, it caused mismatch in actual prediction order from model head output.

(3) : Two factors in this issue. Background ignored and soft_threshold. The final evaluation score does not include the background label score, which never be the same with the best validation score. soft_threshold is set to 0.5 in default. It ignores the max score in prediction below the soft_threshold. But in eval hook(validation), soft_threshold is not considered which is same as threshold 0.0.

Solution
(1) : label dictionary are sorted before evaluation in inference.py rather than removing the sorted(labels) in mask_from_annnotation because anomalib also uses this func and can be affected to Geti which generates random bytes id.
(2) : project_labels are sorted before converting gt mask to otx mask and self.CLASSES are realigned with sorted one.
(3) : To be updated in other PR (need to discuss)

Checklists

[WIP] Unit test code
[WIP] CLI Test

sungmanc · 2023-03-03T07:34:54Z

@ashwinvaidya17 , could you double check the effect of mask_from_annnotation function ? As Soobee said, the sort logic looks weird.

training_extensions/otx/api/utils/segmentation_utils.py

Line 50 in 6cba8c9

def mask_from_annotation(

@sungchul2 , maybe you also have knowledge about the segmentation issue, do you have any idea or comments?

otx/algorithms/segmentation/adapters/mmseg/data/dataset.py

sungmanc

Thanks for the nice work. BTW, unit-test checking is needed

sungchul2

LGTM.
But it would be better to handle it in mask_from_annnotation if there is a way to solve it in mask_from_annnotation.

kprokofi · 2023-03-03T10:46:54Z

Unfortunately, I tested your branch and nothing changed
otx eval outputs much lower score than expected
Many classes aren't predicted at all during training:

otx eval:

I'm looking now also in that problem, but I think sorting classes didn't solve this issue

kprokofi · 2023-03-03T13:31:17Z

during otx eval it seems like there is no labels in dataset at all:

supersoob · 2023-03-05T13:04:50Z

Unfortunately, I tested your branch and nothing changed otx eval outputs much lower score than expected Many classes aren't predicted at all during training: otx eval:

I'm looking now also in that problem, but I think sorting classes didn't solve this issue

@kprokofi Hmm.. My VOC dataset is working well. Could you share your model checkpoint and validation dataset? FYI, this will be completely solved when my follow-up PR is merged (please see (3) in this PR description -> bg label is not still included and soft threshold is 0.5 in default)

supersoob · 2023-03-05T13:31:14Z

during otx eval it seems like there is no labels in dataset at all:

And it's natural that MPASegDataset for test dataset does not include any labels except for bg because otx eval cli passes validation dataset to infer function with_empty_annotations(). Score measurement is conducted with dataset with actual annotations in otx task side not in mpa.

codecov-commenter · 2023-03-06T07:14:46Z

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.02 ⚠️

Comparison is base (4f1a47c) 80.52% compared to head (509d392) 80.51%.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1851      +/-   ##
===========================================
- Coverage    80.52%   80.51%   -0.02%     
===========================================
  Files          477      477              
  Lines        32813    32834      +21     
===========================================
+ Hits         26423    26435      +12     
- Misses        6390     6399       +9

Impacted Files	Coverage Δ
otx/api/usecases/evaluation/basic_operations.py	`98.30% <ø> (ø)`
...rithms/segmentation/adapters/mmseg/data/dataset.py	`88.88% <100.00%> (+0.29%)`	⬆️
otx/algorithms/segmentation/tasks/inference.py	`87.82% <100.00%> (ø)`
otx/cli/tools/find.py	`86.44% <0.00%> (-4.64%)`	⬇️
otx/cli/tools/build.py	`90.24% <0.00%> (-2.26%)`	⬇️
...hms/detection/adapters/mmdet/utils/config_utils.py	`93.29% <0.00%> (-1.22%)`	⬇️
.../api/usecases/exportable_code/streamer/streamer.py	`89.40% <0.00%> (-0.67%)`	⬇️
otx/cli/manager/config_manager.py	`84.98% <0.00%> (-0.28%)`	⬇️
otx/algorithms/anomaly/tasks/inference.py	`82.51% <0.00%> (-0.27%)`	⬇️
otx/algorithms/common/tasks/training_base.py	`50.57% <0.00%> (-0.15%)`	⬇️
... and 8 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

kprokofi · 2023-03-06T17:04:51Z

@supersoob thank you for this update. Testing cityscapes I found out mismatch in labels caused by:

There is no need to shift labels in that case
Could you please remove it? (I mean this "k+1") - simple delete this line.
training_extensions/otx/core/data/adapter/segmentation_dataset_adapter.py

sungmanc · 2023-03-07T00:45:07Z

@supersoob thank you for this update. Testing cityscapes I found out mismatch in labels caused by: There is no need to shift labels in that case Could you please remove it? (I mean this "k+1") - simple delete this line. training_extensions/otx/core/data/adapter/segmentation_dataset_adapter.py

Currently, Cityscapes dataset is not supported due to the background label issue.

supersoob · 2023-03-07T00:52:22Z

@supersoob thank you for this update. Testing cityscapes I found out mismatch in labels caused by: There is no need to shift labels in that case Could you please remove it? (I mean this "k+1") - simple delete this line. training_extensions/otx/core/data/adapter/segmentation_dataset_adapter.py

Thank you for testing with various dataset. But unfortunately, every background insertion in otx is root cause for running cityscapes. Even if I shift the label, cityscape won't work as long as I don't remove all bg label in otx which is a very huge job. And that solution affects to voc and other dataset with bg. I plan to stabilize voc first and then look at cityscapes.. I hope to merge this PR if you don't have performance problem with voc and custom seg with >10 labels(with backgrounds)

kprokofi

Let's merge it now! Thank you for solving labels mismatch problem

Lee, Soobee added 6 commits March 3, 2023 08:49

fix label order to align model label schema

1990fd1

fix label order to align model label schema

1275784

fix validation log

da78587

fix validation log

1632307

consider ignore label

2b32ca5

annotation update

6b2f7bf

supersoob requested a review from a team as a code owner March 3, 2023 05:01

github-actions bot added ALGO Any changes in OTX Algo Tasks implementation API Any changes in OTX API labels Mar 3, 2023

supersoob requested review from JihwanEom and kprokofi March 3, 2023 05:03

harimkang reviewed Mar 3, 2023

View reviewed changes

otx/algorithms/segmentation/adapters/mmseg/data/dataset.py Outdated Show resolved Hide resolved

sungmanc previously approved these changes Mar 3, 2023

View reviewed changes

jaegukhyun added this to the 1.1.0 milestone Mar 3, 2023

sungchul2 previously approved these changes Mar 3, 2023

View reviewed changes

Lee, Soobee added 2 commits March 3, 2023 21:02

fithreshold

d4ba2ca

resolve the label order issue in otx eval

f8b5e8b

supersoob dismissed stale reviews from sungchul2 and sungmanc via 509d392 March 6, 2023 06:47

github-actions bot added the TEST Any changes in tests label Mar 6, 2023

fix and add unit test

509d392

jaegukhyun approved these changes Mar 7, 2023

View reviewed changes

kprokofi approved these changes Mar 7, 2023

View reviewed changes

kprokofi merged commit 872f119 into develop Mar 7, 2023

kprokofi deleted the soobee/fix-eval-seg branch March 7, 2023 09:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FIX] Fix label mismatch of evaluation and validation with large dataset in semantic segmentation #1851

[FIX] Fix label mismatch of evaluation and validation with large dataset in semantic segmentation #1851

supersoob commented Mar 3, 2023 •

edited

Loading

sungmanc commented Mar 3, 2023

sungmanc left a comment •

edited

Loading

sungchul2 left a comment

kprokofi commented Mar 3, 2023

kprokofi commented Mar 3, 2023

supersoob commented Mar 5, 2023 •

edited

Loading

supersoob commented Mar 5, 2023 •

edited

Loading

codecov-commenter commented Mar 6, 2023

kprokofi commented Mar 6, 2023 •

edited

Loading

sungmanc commented Mar 7, 2023

supersoob commented Mar 7, 2023 •

edited

Loading

kprokofi left a comment

[FIX] Fix label mismatch of evaluation and validation with large dataset in semantic segmentation #1851

[FIX] Fix label mismatch of evaluation and validation with large dataset in semantic segmentation #1851

Conversation

supersoob commented Mar 3, 2023 • edited Loading

sungmanc commented Mar 3, 2023

sungmanc left a comment • edited Loading

Choose a reason for hiding this comment

sungchul2 left a comment

Choose a reason for hiding this comment

kprokofi commented Mar 3, 2023

kprokofi commented Mar 3, 2023

supersoob commented Mar 5, 2023 • edited Loading

supersoob commented Mar 5, 2023 • edited Loading

codecov-commenter commented Mar 6, 2023

Codecov Report

kprokofi commented Mar 6, 2023 • edited Loading

sungmanc commented Mar 7, 2023

supersoob commented Mar 7, 2023 • edited Loading

kprokofi left a comment

Choose a reason for hiding this comment

supersoob commented Mar 3, 2023 •

edited

Loading

sungmanc left a comment •

edited

Loading

supersoob commented Mar 5, 2023 •

edited

Loading

supersoob commented Mar 5, 2023 •

edited

Loading

kprokofi commented Mar 6, 2023 •

edited

Loading

supersoob commented Mar 7, 2023 •

edited

Loading