SPARKNLP-746: Handle empty validation sets #13615

DevinTDHa
Member

Description

Users can set a validation split when training a ClassifierDLApproach or MultiClassifierDLApproach. However, if both the split fraction and the number of training rows are low, the resulting validation set can be empty. Currently, this causes an uncaught error. This PR handles that case and prints a warning instead.

Changes:

  • handle the case where insufficient training data combined with a low validation split produces an empty validation set, and print a warning
  • resolved some warnings
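
For illustration, the failure mode and the guard described above can be sketched roughly as follows. This is not the code from this PR: `split_train_validation` is a hypothetical helper, and it assumes the validation size is the truncated product of the row count and the split fraction, which may differ from how Spark NLP actually computes the split.

```python
import warnings


def split_train_validation(rows, validation_split):
    """Split rows into (train, validation). Hypothetical helper, not Spark NLP API."""
    if not 0.0 <= validation_split < 1.0:
        raise ValueError("validation_split must be in [0, 1)")

    # Assumption: validation size is the truncated product of row count and fraction.
    validation_size = int(len(rows) * validation_split)

    if validation_split > 0.0 and validation_size == 0:
        # Too few rows for the requested fraction: warn and continue without validation.
        warnings.warn(
            f"validation_split={validation_split} on {len(rows)} rows yields an "
            "empty validation set; skipping validation.",
            RuntimeWarning,
        )
        return list(rows), []

    # Take the validation rows from the end of the input.
    cut = len(rows) - validation_size
    return list(rows[:cut]), list(rows[cut:])
```

With 5 rows and a split of 0.1, the truncated size is 0, so the sketch warns and returns all rows for training; with 10 rows and a split of 0.2, it holds out the last 2 rows.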

How Has This Been Tested?

Added new tests to reflect the behavior. Tests are passing.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Code improvements with no or little impact
  • New feature (non-breaking change which adds functionality)

maziyarpanahi changed the base branch from master to release/432-release-candidate March 14, 2023 08:49
maziyarpanahi merged commit bad5435 into JohnSnowLabs:release/432-release-candidate Mar 14, 2023
maziyarpanahi pushed a commit that referenced this pull request May 10, 2023
Labels
bug-fix DON'T MERGE Do not merge this PR