SPARKNLP-667: Added try-catch block for custom pattern/char #13291

DevinTDHa
Member

Description

Restores the previous behavior, in which all exceptions were caught while applying custom patterns/chars in TokenizerModel.
If a user-provided pattern/char cannot be applied, a message is logged instead of an exception being thrown.
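
For illustration only, the behavior described above can be sketched as follows. This is not the code from `TokenizerModel.scala`: the object name `SafePatternSketch`, the helper `splitOrFallback`, and the use of `Console.err` in place of a real logger are all assumptions made for the example.

```scala
object SafePatternSketch {

  // Hypothetical helper (not part of Spark NLP): try to split `text` on a
  // user-provided regex. If the pattern cannot be applied, log a message
  // and return the text unsplit instead of propagating the exception.
  def splitOrFallback(text: String, userPattern: String): Seq[String] =
    try {
      text.split(userPattern).toSeq.filter(_.nonEmpty)
    } catch {
      case e: Exception =>
        // Stand-in for a logger call; the real logging API is not shown in this PR.
        Console.err.println(
          s"Could not apply custom pattern '$userPattern': ${e.getMessage}")
        Seq(text)
    }
}
```

With this sketch, a valid pattern such as `"-"` splits the input as usual, while an invalid one such as `"["` (an unclosed character class) only produces a log line and leaves the input as a single token.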

How Has This Been Tested?

All tests passing

- if a user provided pattern/char can not be applied, a
  message will be logged instead of throwing an exception
@coveralls

Pull Request Test Coverage Report for Build 3807679114

  • 11 of 13 (84.62%) changed or added relevant lines in 1 file are covered.
  • 7 unchanged lines in 7 files lost coverage.
  • Overall coverage decreased (-0.03%) to 68.085%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/main/scala/com/johnsnowlabs/nlp/annotators/TokenizerModel.scala | 11 | 13 | 84.62% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/main/scala/com/johnsnowlabs/nlp/annotators/er/EntityRulerApproach.scala | 1 | 95.74% |
| src/main/scala/com/johnsnowlabs/nlp/annotators/ner/crf/FeatureGenerator.scala | 1 | 94.41% |
| src/main/scala/com/johnsnowlabs/nlp/annotators/ner/dl/NerDLApproach.scala | 1 | 80.51% |
| src/main/scala/com/johnsnowlabs/nlp/annotators/ner/NerTagsEncoding.scala | 1 | 72.73% |
| src/main/scala/com/johnsnowlabs/nlp/annotators/sentence_detector_dl/SentenceDetectorDLModel.scala | 1 | 82.29% |
| src/main/scala/com/johnsnowlabs/nlp/annotators/spell/util/Utilities.scala | 1 | 98.0% |
| src/main/scala/com/johnsnowlabs/nlp/util/io/OutputHelper.scala | 1 | 41.18% |

| Totals | |
|---|---|
| Change from base Build 3805504063 | -0.03% |
| Covered Lines | 8734 |
| Relevant Lines | 12828 |

💛 - Coveralls

maziyarpanahi merged commit cde336a into JohnSnowLabs:release/427-release-candidate on Dec 31, 2022
maziyarpanahi added a commit that referenced this pull request Jan 12, 2023
* Removed duplicated method definition (#13280)

Removed the duplicated definition of method `setWeightedDistPath` from `ContextSpellCheckerApproach`.

* SPARKNLP-703 Fix Finisher outputAnnotatorType Issue (#13282)

* SPARKNLP-703 adding control to avoid loading outputAnnotatorType attribute when components don't override it

* SPARKNLP-703 Adding validation when PipelineModel is part of stages

* SPARKNLP-667: Fix indexing issue for custom pattern (#13283)

- fix for patterns with lookahead/behinds that have 0 width matches, indexes would not be calculated correctly
- resolved some warnings
- refactored tokenizer tests and added new index alignment check

* SPARKNLP-708 Enabling embeddings output in LightPipeline.fullAnnotate (#13284)

Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* Bump version to 4.2.7

* SPARKNLP-667: Added try-catch block for custom pattern/char (#13291)

- if a user provided pattern/char can not be applied, a
  message will be logged instead of throwing an exception

* Enable dropInvalid in reading photos

* disable `assemble an image input` unit test

- this unit test fails randomly with either a `javax.imageio.IIOException: Unsupported Image Type` or a failed assertion on `annotationImage.height`. Because it passes when retried, the cause is probably at the OS/file system level

* SPARKNLP-713 Modifies Default Values GraphExtraction (#13305)

* SPARKNLP-713 Modifies default values of explodeEntities and mergeEntities

* SPARKNLP-713 Refactor GraphFinisher Tests

* SPARKNLP-713 Adding warning message for empty paths

* Fix links for APIs in Open Source (#13312)

* Update 2022-09-27-finassertion_time_en.md

* Update 2022-08-17-finner_orgs_prods_alias_en_3_2.md

* Update 2022-08-17-legner_orgs_prods_alias_en_3_2.md

* Update fin/leg clf models' benchmark (#13276)

* release note for 4.5.0 including gif (#13301)

Co-authored-by: pranab <pranab@johnsnowlabs.com>
Co-authored-by: diatrambitas <JSL.Git2018>

* Databricks installation instructions update. (#13261)

* Databricks installation instructions update.

* updated DB installation steps

Co-authored-by: diatrambitas <JSL.Git2018>

* Update 2022-09-27-legassertion_time_en.md

* Input output images (#13310)

* [skip test] Fix links for APIs in Open Source

Co-authored-by: Jose J. Martinez <36634572+josejuanmartinez@users.noreply.github.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
Co-authored-by: rpranab <33893292+rpranab@users.noreply.github.com>
Co-authored-by: pranab <pranab@johnsnowlabs.com>
Co-authored-by: Jiri Dobes <44785730+jdobes-cz@users.noreply.github.com>
Co-authored-by: Lev <agsfer@gmail.com>

* SPARKNLP-715 Fix sentence index computation (#13318)

* Update CHANGELOG for 4.2.7 [run doc]

* Update Scala and Python APIs

* Release Spark NLP 4.2.7 on Conda [skip test]

Co-authored-by: David Cecchini <dadachini@hotmail.com>
Co-authored-by: Danilo Burbano <37355249+danilojsl@users.noreply.github.com>
Co-authored-by: Devin Ha <33089471+DevinTDHa@users.noreply.github.com>
Co-authored-by: Jose J. Martinez <36634572+josejuanmartinez@users.noreply.github.com>
Co-authored-by: Bünyamin Polat <78386903+bunyamin-polat@users.noreply.github.com>
Co-authored-by: rpranab <33893292+rpranab@users.noreply.github.com>
Co-authored-by: pranab <pranab@johnsnowlabs.com>
Co-authored-by: Jiri Dobes <44785730+jdobes-cz@users.noreply.github.com>
Co-authored-by: Lev <agsfer@gmail.com>
Co-authored-by: github-actions <action@github.com>