Chore: Documentation and input / output cols #715

elboy3 · 2023-07-18T00:24:27Z

Small cleanup leading up to S2S sprint during onsite week

setu4993 · 2023-07-18T00:27:07Z

dataquality/loggers/model_logger/seq2seq.py

@@ -135,7 +154,7 @@ def _get_data_dict(self) -> Dict:
            C.epoch.value: [self.epoch] * batch_size,
        }
        if self.split == Split.inference:
-            data["inference_name"] = [self.inference_name] * batch_size
+            data[C.inference_name.value] = [self.inference_name] * batch_size


Do we need `.value` here? I think it works without it, too?

Suggested change

-            data[C.inference_name.value] = [self.inference_name] * batch_size
+            data[C.inference_name] = [self.inference_name] * batch_size

Yes, we do! This creates a dict that later gets cast to a vaex DataFrame, which can't take enum members as the dict keys.

The following errors:

    from enum import Enum

    import vaex


    class MyType(str, Enum):
        a = "a"
        b = "b"
        c = "c"


    df = vaex.from_dict(
        {
            MyType.a: [1, 2, 3],
            MyType.b: [4, 5, 6],
            MyType.c: [7, 8, 9],
        }
    )
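A minimal sketch of the distinction discussed above, using only the standard library (it does not call vaex): a member of a `str`-mixin enum passes `isinstance(..., str)` but is not a plain `str`, whereas its `.value` is. The helper `plain_keys` is hypothetical and not part of dataquality or vaex; it only illustrates normalizing enum keys before a dict is handed to something that expects plain string column names.

```python
from enum import Enum
from typing import Any, Dict, Hashable


class MyType(str, Enum):
    a = "a"
    b = "b"


def plain_keys(data: Dict[Hashable, Any]) -> Dict[Hashable, Any]:
    """Hypothetical helper: swap Enum keys for their underlying values."""
    return {
        (key.value if isinstance(key, Enum) else key): column
        for key, column in data.items()
    }


raw = {MyType.a: [1, 2, 3], MyType.b: [4, 5, 6]}
cleaned = plain_keys(raw)

# A str-mixin member is an instance of str, but its exact type is the enum.
print(isinstance(MyType.a, str), type(MyType.a) is str)  # True False
# Its `.value` is a built-in string, which is why the diff uses `.value`.
print(type(MyType.a.value) is str)  # True
print([type(k).__name__ for k in cleaned])  # ['str', 'str']
```

Writing `C.inference_name.value` at the call site, as the diff does, has the same effect as this helper for a single key.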

setu4993 · 2023-07-18T00:27:21Z

dataquality/schemas/seq2seq.py

    token_deps = "token_deps"
    token_gold_probs = "token_gold_probs"
    # Mypy complained about split as an attribute, so we use `split_`
    split_ = "split"
    epoch = "epoch"
+    inference_name = "inference_name"


Why inference_name?

For line 157 in dataquality/loggers/model_logger/seq2seq.py

My question was: why call it `inference_name`? A name is too generic as a column name, and it doesn't feel like that's what we should be getting?

codecov-commenter · 2023-07-18T00:33:13Z

Codecov Report

Merging #715 (3b8333c) into main (59cfd9a) will increase coverage by 0.00%.
The diff coverage is 90.00%.

@@           Coverage Diff           @@
##             main     #715   +/-   ##
=======================================
  Coverage   89.69%   89.70%           
=======================================
  Files         166      166           
  Lines       13320    13323    +3     
=======================================
+ Hits        11948    11951    +3     
  Misses       1372     1372

Impacted Files	Coverage Δ
dataquality/utils/arrow.py	`100.00% <ø> (ø)`
dataquality/loggers/model_logger/seq2seq.py	`91.80% <75.00%> (ø)`
dataquality/loggers/data_logger/seq2seq.py	`95.06% <100.00%> (ø)`
dataquality/schemas/seq2seq.py	`100.00% <100.00%> (ø)`

... and 2 files with indirect coverage changes

Documentation and input / output cols

3b8333c

elboy3 requested review from a team and dcaustin33 as code owners July 18, 2023 00:24

setu4993 approved these changes Jul 18, 2023

anthonycorletti approved these changes Jul 18, 2023

elboy3 merged commit 7db3ef4 into main Jul 18, 2023

elboy3 deleted the chore/s2s-cleanup branch July 18, 2023 19:51

bogdan-galileo pushed a commit that referenced this pull request Jul 21, 2023

Chore: Documentation and input / output cols (#715)

e15f3b6

Small cleanup leading up to S2S sprint during onsite week
