SparkNLP 1018 - Introducing NLLB #14209

prabod
Contributor

@prabod prabod commented Mar 19, 2024

Description

Meta AI has built a single AI model, NLLB-200, that is the first to translate across 200 different languages with state-of-the-art quality that has been validated through extensive evaluations for each of them.

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING page.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@prabod prabod added new-feature Introducing a new feature new model DON'T MERGE Do not merge this PR labels Mar 19, 2024
@prabod prabod self-assigned this Mar 19, 2024
@prabod prabod force-pushed the SPARKNLP-1018-Add-SentencePiece-support-to-M2M100 branch from 79a0e5c to 0310504 Compare July 15, 2024 06:51
@prabod prabod marked this pull request as ready for review July 22, 2024 10:09
@maziyarpanahi maziyarpanahi changed the base branch from master to release/550-release-candidate September 1, 2024 18:09
@maziyarpanahi maziyarpanahi merged commit c68be6a into release/550-release-candidate Sep 1, 2024
4 checks passed
@coveralls

Pull Request Test Coverage Report for Build 10656276506

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.007%) to 62.415%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| src/main/scala/com/johnsnowlabs/nlp/annotators/sentence_detector_dl/SentenceDetectorDLModel.scala | 1 | 81.68% |

Totals
  • Change from base Build 10656260334: -0.007%
  • Covered Lines: 8969
  • Relevant Lines: 14370

💛 - Coveralls

@yaronshanisima

Hi @prabod, and thank you for the contribution!
Do you know if this is still a work in progress?
I ask because when I run the example code here with just the default pretrained value, I get:
"Can not find the model to download please check the name!"
Also, when I query the SparkNLP model hub, I don't see any NLLB models. Is this something that will be added in the future? Do you have the models that you ran the tests on?
Thanks!

@maziyarpanahi
Member

Hi @prabod, and thank you for the contribution! Do you know if this is still a work in progress? I ask because when I run the example code here with just the default pretrained value, I get: "Can not find the model to download please check the name!" Also, when I query the SparkNLP model hub, I don't see any NLLB models. Is this something that will be added in the future? Do you have the models that you ran the tests on? Thanks!

Hi,

This feature will be completed and released in the upcoming version in a week or two. At the moment it is missing some final pieces.

Labels
DON'T MERGE Do not merge this PR new model new-feature Introducing a new feature
4 participants