-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Chore: Documentation and input / output cols #715
Conversation
@@ -135,7 +154,7 @@ def _get_data_dict(self) -> Dict: | |||
C.epoch.value: [self.epoch] * batch_size, | |||
} | |||
if self.split == Split.inference: | |||
data["inference_name"] = [self.inference_name] * batch_size | |||
data[C.inference_name.value] = [self.inference_name] * batch_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a .value
here? I think it works simply, too?
data[C.inference_name.value] = [self.inference_name] * batch_size | |
data[C.inference_name] = [self.inference_name] * batch_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we do! This creates a dict that later gets cast to a vaex DataFrame, which can't have enum as the keys to the dict
The following errors:
from enum import Enum
import vaex
class MyType(str, Enum):
a = "a"
b = "b"
c = "c"
df = vaex.from_dict(
{
MyType.a: [1, 2, 3],
MyType.b: [4, 5, 6],
MyType.c: [7, 8, 9],
}
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL!
token_deps = "token_deps" | ||
token_gold_probs = "token_gold_probs" | ||
# Mypy complained about split as an attribute, so we use `split_` | ||
split_ = "split" | ||
epoch = "epoch" | ||
inference_name = "inference_name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why inference_name
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For line 157 in dataquality/loggers/model_logger/seq2seq.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question was why call it inference name? Name as a column name is too generic and doesn't feel like that's what we should be getting?
Codecov Report
@@ Coverage Diff @@
## main #715 +/- ##
=======================================
Coverage 89.69% 89.70%
=======================================
Files 166 166
Lines 13320 13323 +3
=======================================
+ Hits 11948 11951 +3
Misses 1372 1372
... and 2 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Small cleanup leading up to S2S sprint during onsite week
Small cleanup leading up to S2S sprint during onsite week