Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform separate validation and test epochs per dataset when multiple files are specified (Fixes #1634 and #2043) #2038

Merged
merged 6 commits into from
Apr 16, 2019

Conversation

reuben
Copy link
Contributor

@reuben reuben commented Apr 10, 2019

As a consequence this also removes support for caching to disk the processed validation and test sets. They're usually small enough that caching to disk makes very little difference, and the complexity of keeping track of multiple cache paths for each CSV is not worth it IMO.

Sorry for the continued churn on the progress bar/logging part of the code. This PR also includes some cleanup of that code using progressbar.NullBar so that there's now a single check for the --show_progressbar flag. With this PR, the log with and without progress bars should be consistent in the information it provides.

I recommend disabling whitespace changes for reviewing the last commit that touches evaluate.py, otherwise the diff ends up looking like a mess.

@reuben reuben requested a review from tilmankamp April 10, 2019 20:24
@reuben reuben changed the title Perform separate validation and test epochs per dataset when multiple files are specified Perform separate validation and test epochs per dataset when multiple files are specified (Fixes #1634) Apr 10, 2019
DeepSpeech.py Outdated
set_loss = run_set('dev', epoch, init_op, dataset=csv)
dev_loss += set_loss
log_progress('Finished validating epoch %d on %s - loss: %f' % (epoch, csv, set_loss))
dev_loss = dev_loss / len(dev_csvs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does diving by len(dev_csvs) give the same result as before this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, if the sizes of the dev sets are unbalanced then the mean of sample means will not match the mean of the whole population. We can get to the same result by doing a weighted average if we know the sample size, will have to return that value from the run_set function since the dataset size is only known after a full iteration has been done through it. I don't expect the difference to be large but I'll fix this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Copy link
Contributor

@tilmankamp tilmankamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know the train-time implications of not having cached features for validation and testing?

evaluate.py Outdated Show resolved Hide resolved
DeepSpeech.py Outdated Show resolved Hide resolved
util/flags.py Outdated Show resolved Hide resolved
@reuben
Copy link
Contributor Author

reuben commented Apr 11, 2019

Validation epochs are normally so fast that I don't think I've ever actually cached them in runs, so I can't give you numbers, but that goes to show that it probably does not make a huge difference :)

As for test epochs, the acoustic model prediction is usually done in the same time as the validation sets, it's the decoding that takes the majority of the time.

@reuben reuben force-pushed the split-dev-test-epochs branch from 9076adf to 946828c Compare April 11, 2019 14:09
@reuben reuben changed the title Perform separate validation and test epochs per dataset when multiple files are specified (Fixes #1634) Perform separate validation and test epochs per dataset when multiple files are specified (Fixes #1634 and #2043) Apr 11, 2019
Copy link
Contributor

@tilmankamp tilmankamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but before merging all checks should pass.

@reuben
Copy link
Contributor Author

reuben commented Apr 12, 2019

All green.

@reuben
Copy link
Contributor Author

reuben commented Apr 15, 2019

@kdavis-mozilla review ping

@kdavis-mozilla
Copy link
Contributor

@reuben As I mentioned I dont' have time to do the review. If tilman gave it a r+, I'm fine with that.

@kdavis-mozilla kdavis-mozilla removed their request for review April 16, 2019 08:01
@reuben reuben force-pushed the split-dev-test-epochs branch from 2351bf1 to 904ab1e Compare April 16, 2019 14:07
@reuben reuben merged commit 1e601d5 into master Apr 16, 2019
@reuben reuben deleted the split-dev-test-epochs branch April 16, 2019 15:23
@lock
Copy link

lock bot commented May 16, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators May 16, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants