Add option to split data into train/test/validate sets #149

Merged
merged 23 commits into from
Nov 5, 2019

@martham93 martham93 commented Oct 25, 2019

This PR addresses the enhancement outlined in issue 147: it adds the option to split data into train/test/validate sets of user-specified sizes. The code still keeps the previous label-maker default of a 0.8/0.2 train/test split.
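For readers skimming the thread, here is a minimal sketch of the kind of split described above. It is not the code in this PR: the function name `split_indices` and the parameters `split_names`, `split_vals`, and `seed` are assumptions chosen for illustration, and only the 0.8/0.2 train/test default comes from the description.

```python
import math
import random
from itertools import accumulate


def split_indices(n_items, split_names=("train", "test"),
                  split_vals=(0.8, 0.2), seed=0):
    """Hypothetical helper: shuffle item indices and cut them into named splits.

    The defaults mirror the 0.8/0.2 train/test split mentioned in the PR
    description; all names here are illustrative, not label-maker's API.
    """
    if len(split_names) != len(split_vals):
        raise ValueError("split_names and split_vals must have the same length")
    if not math.isclose(sum(split_vals), 1.0):
        raise ValueError("split_vals must sum to 1")

    # Shuffle with a seeded generator so the split is reproducible.
    order = list(range(n_items))
    random.Random(seed).shuffle(order)

    # Cumulative fractions become slice boundaries; the last boundary is
    # pinned to n_items so float rounding cannot drop any item.
    bounds = [round(c * n_items) for c in accumulate(split_vals)]
    bounds[-1] = n_items
    starts = [0] + bounds[:-1]

    return {name: order[a:b]
            for name, a, b in zip(split_names, starts, bounds)}


# Three-way split of 100 items into 70/20/10.
splits = split_indices(100, ("train", "test", "val"), (0.7, 0.2, 0.1))
```

Because the boundaries are derived from a cumulative sum, the same function handles two, three, or any other number of splits without special cases.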

cc @wronk , @drewbo

@martham93 martham93 requested a review from wronk October 25, 2019 19:35
@martham93 martham93 closed this Oct 28, 2019
@martham93 martham93 reopened this Oct 28, 2019

@wronk wronk left a comment

Suggested some improvements. Main one is around allowing an arbitrary number of splits rather than just 2 or 3. I think we'd want to let people split things as they see fit, but lmk what you think.

label_maker/package.py (outdated, resolved)
label_maker/package.py (outdated, resolved)
label_maker/package.py (outdated, resolved)
label_maker/package.py (outdated, resolved)
label_maker/package.py (outdated, resolved)
label_maker/package.py (resolved)
label_maker/package.py (outdated, resolved)
label_maker/package.py (outdated, resolved)
label_maker/validate.py (outdated, resolved)
test/integration/test_classification_package.py (outdated, resolved)

wronk commented Oct 29, 2019

@martham93, can you also update/add the relevant params to the docs? You can just follow the format of the other params

Here's the relevant page

Then you should be able to run make html from the docs/ directory to generate the new documentation.

@wronk wronk self-requested a review October 30, 2019 17:43

@wronk wronk left a comment

@martham93, minor changes to make, but looks good overall. Good work

@drewbo, want to have a quick once over whenever you're back?

docs/parameters.rst (outdated, resolved)
docs/parameters.rst (outdated, resolved)
@martham93 martham93 requested a review from wronk October 30, 2019 18:01
@drewbo drewbo self-requested a review November 4, 2019 15:39

@wronk wronk left a comment

LGTM

@martham93

Is it okay if I merge this, @drewbo @wronk?

@martham93 martham93 merged commit ee40b24 into master Nov 5, 2019
@martham93 martham93 deleted the train_test_val branch July 7, 2020 22:10

3 participants