diff --git a/docs/_posts/ahmedlone127/2023-09-12-bert_base_dutch_cased_finetuned_mark_en.md b/docs/_posts/ahmedlone127/2023-09-12-bert_base_dutch_cased_finetuned_mark_en.md
new file mode 100644
index 00000000000000..1ea2c73664ac48
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-12-bert_base_dutch_cased_finetuned_mark_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_dutch_cased_finetuned_mark BertEmbeddings from markverschuren
+author: John Snow Labs
+name: bert_base_dutch_cased_finetuned_mark
+date: 2023-09-12
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_dutch_cased_finetuned_mark` is an English model originally trained by markverschuren.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_mark_en_5.1.1_3.0_1694551719944.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_mark_en_5.1.1_3.0_1694551719944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
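+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_dutch_cased_finetuned_mark", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_dutch_cased_finetuned_mark", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+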
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_dutch_cased_finetuned_mark|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|406.8 MB|
+
+## References
+
+https://huggingface.co/markverschuren/bert-base-dutch-cased-finetuned-mark
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-12-legal_bert_small_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-12-legal_bert_small_uncased_en.md
new file mode 100644
index 00000000000000..433d1f39c8066a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-12-legal_bert_small_uncased_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English legal_bert_small_uncased BertEmbeddings from nlpaueb
+author: John Snow Labs
+name: legal_bert_small_uncased
+date: 2023-09-12
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_bert_small_uncased` is an English model originally trained by nlpaueb.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_bert_small_uncased_en_5.1.1_3.0_1694561644609.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_bert_small_uncased_en_5.1.1_3.0_1694561644609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
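+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("legal_bert_small_uncased", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("legal_bert_small_uncased", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+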
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|legal_bert_small_uncased|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|130.6 MB|
+
+## References
+
+https://huggingface.co/nlpaueb/legal-bert-small-uncased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-adopted_bert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-09-13-adopted_bert_base_cased_en.md
new file mode 100644
index 00000000000000..5a74a536c75295
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-adopted_bert_base_cased_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English adopted_bert_base_cased BertEmbeddings from sivanravid
+author: John Snow Labs
+name: adopted_bert_base_cased
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adopted_bert_base_cased` is an English model originally trained by sivanravid.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adopted_bert_base_cased_en_5.1.1_3.0_1694617850169.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adopted_bert_base_cased_en_5.1.1_3.0_1694617850169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
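+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("adopted_bert_base_cased", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("adopted_bert_base_cased", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+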
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|adopted_bert_base_cased|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/sivanravid/adopted-bert-base-cased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-aivengers_bert_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-13-aivengers_bert_finetuned_en.md
new file mode 100644
index 00000000000000..211ca83e5c0862
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-aivengers_bert_finetuned_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English aivengers_bert_finetuned BertEmbeddings from dkqp
+author: John Snow Labs
+name: aivengers_bert_finetuned
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aivengers_bert_finetuned` is an English model originally trained by dkqp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aivengers_bert_finetuned_en_5.1.1_3.0_1694620043636.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aivengers_bert_finetuned_en_5.1.1_3.0_1694620043636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
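+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("aivengers_bert_finetuned", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("aivengers_bert_finetuned", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+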
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|aivengers_bert_finetuned|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|665.0 MB|
+
+## References
+
+https://huggingface.co/dkqp/AiVENGERS_BERT_FineTuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-aivengers_multilingual_base_xx.md b/docs/_posts/ahmedlone127/2023-09-13-aivengers_multilingual_base_xx.md
new file mode 100644
index 00000000000000..5d063928b4a497
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-aivengers_multilingual_base_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual aivengers_multilingual_base BertEmbeddings from kimjae
+author: John Snow Labs
+name: aivengers_multilingual_base
+date: 2023-09-13
+tags: [bert, xx, open_source, fill_mask, onnx]
+task: Embeddings
+language: xx
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aivengers_multilingual_base` is a Multilingual model originally trained by kimjae.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aivengers_multilingual_base_xx_5.1.1_3.0_1694632030712.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aivengers_multilingual_base_xx_5.1.1_3.0_1694632030712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
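+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("aivengers_multilingual_base", "xx") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("aivengers_multilingual_base", "xx")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+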
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|aivengers_multilingual_base|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|xx|
+|Size:|665.0 MB|
+
+## References
+
+https://huggingface.co/kimjae/aivengers_multilingual_base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-alberti_bert_base_multilingual_cased_flax_community_xx.md b/docs/_posts/ahmedlone127/2023-09-13-alberti_bert_base_multilingual_cased_flax_community_xx.md
new file mode 100644
index 00000000000000..dc42e3991efcd5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-alberti_bert_base_multilingual_cased_flax_community_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual alberti_bert_base_multilingual_cased_flax_community BertEmbeddings from flax-community
+author: John Snow Labs
+name: alberti_bert_base_multilingual_cased_flax_community
+date: 2023-09-13
+tags: [bert, xx, open_source, fill_mask, onnx]
+task: Embeddings
+language: xx
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `alberti_bert_base_multilingual_cased_flax_community` is a Multilingual model originally trained by flax-community.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/alberti_bert_base_multilingual_cased_flax_community_xx_5.1.1_3.0_1694642221069.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/alberti_bert_base_multilingual_cased_flax_community_xx_5.1.1_3.0_1694642221069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
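+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("alberti_bert_base_multilingual_cased_flax_community", "xx") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("alberti_bert_base_multilingual_cased_flax_community", "xx")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+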
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|alberti_bert_base_multilingual_cased_flax_community|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|xx|
+|Size:|664.4 MB|
+
+## References
+
+https://huggingface.co/flax-community/alberti-bert-base-multilingual-cased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-alephbertgimmel_small_128_he.md b/docs/_posts/ahmedlone127/2023-09-13-alephbertgimmel_small_128_he.md
new file mode 100644
index 00000000000000..9a91bffe8d49c9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-alephbertgimmel_small_128_he.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Hebrew alephbertgimmel_small_128 BertEmbeddings from imvladikon
+author: John Snow Labs
+name: alephbertgimmel_small_128
+date: 2023-09-13
+tags: [bert, he, open_source, fill_mask, onnx]
+task: Embeddings
+language: he
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `alephbertgimmel_small_128` is a Hebrew model originally trained by imvladikon.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/alephbertgimmel_small_128_he_5.1.1_3.0_1694642025855.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/alephbertgimmel_small_128_he_5.1.1_3.0_1694642025855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
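+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("alephbertgimmel_small_128", "he") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("alephbertgimmel_small_128", "he")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+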
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|alephbertgimmel_small_128|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|he|
+|Size:|295.5 MB|
+
+## References
+
+https://huggingface.co/imvladikon/alephbertgimmel-small-128
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-answer_model_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-13-answer_model_bert_base_uncased_en.md
new file mode 100644
index 00000000000000..65cf8254efadbd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-answer_model_bert_base_uncased_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English answer_model_bert_base_uncased BertEmbeddings from Mayank393
+author: John Snow Labs
+name: answer_model_bert_base_uncased
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `answer_model_bert_base_uncased` is an English model originally trained by Mayank393.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/answer_model_bert_base_uncased_en_5.1.1_3.0_1694619524376.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/answer_model_bert_base_uncased_en_5.1.1_3.0_1694619524376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
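+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("answer_model_bert_base_uncased", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("answer_model_bert_base_uncased", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+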
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|answer_model_bert_base_uncased|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/Mayank393/Answer_Model_Bert_Base_uncased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-arab_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-arab_bert_en.md
new file mode 100644
index 00000000000000..551c27abf4ba4f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-arab_bert_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English arab_bert BertEmbeddings from MutazYoune
+author: John Snow Labs
+name: arab_bert
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arab_bert` is an English model originally trained by MutazYoune.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arab_bert_en_5.1.1_3.0_1694639723986.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arab_bert_en_5.1.1_3.0_1694639723986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
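+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("arab_bert", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("arab_bert", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+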
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|arab_bert|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|406.4 MB|
+
+## References
+
+https://huggingface.co/MutazYoune/ARAB_BERT
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-arabertautomodelformaskedlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-arabertautomodelformaskedlm_en.md
new file mode 100644
index 00000000000000..38570a5356f300
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-arabertautomodelformaskedlm_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English arabertautomodelformaskedlm BertEmbeddings from oknashar
+author: John Snow Labs
+name: arabertautomodelformaskedlm
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabertautomodelformaskedlm` is an English model originally trained by oknashar.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabertautomodelformaskedlm_en_5.1.1_3.0_1694615119192.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabertautomodelformaskedlm_en_5.1.1_3.0_1694615119192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
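+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("arabertautomodelformaskedlm", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("arabertautomodelformaskedlm", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+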
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|arabertautomodelformaskedlm|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|504.6 MB|
+
+## References
+
+https://huggingface.co/oknashar/arabertAutoModelForMaskedLM
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-assamese_bert_as.md b/docs/_posts/ahmedlone127/2023-09-13-assamese_bert_as.md
new file mode 100644
index 00000000000000..e63db8e5e431b3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-assamese_bert_as.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Assamese assamese_bert BertEmbeddings from l3cube-pune
+author: John Snow Labs
+name: assamese_bert
+date: 2023-09-13
+tags: [bert, as, open_source, fill_mask, onnx]
+task: Embeddings
+language: as
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `assamese_bert` is an Assamese model originally trained by l3cube-pune.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/assamese_bert_as_5.1.1_3.0_1694642657748.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/assamese_bert_as_5.1.1_3.0_1694642657748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
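+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("assamese_bert", "as") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("assamese_bert", "as")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+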
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|assamese_bert|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|as|
+|Size:|890.4 MB|
+
+## References
+
+https://huggingface.co/l3cube-pune/assamese-bert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-b_fb_sms_lm_en.md b/docs/_posts/ahmedlone127/2023-09-13-b_fb_sms_lm_en.md
new file mode 100644
index 00000000000000..bb4d7f3bdcbf69
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-b_fb_sms_lm_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English b_fb_sms_lm BertEmbeddings from adnankhawaja
+author: John Snow Labs
+name: b_fb_sms_lm
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `b_fb_sms_lm` is an English model originally trained by adnankhawaja.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/b_fb_sms_lm_en_5.1.1_3.0_1694638757732.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/b_fb_sms_lm_en_5.1.1_3.0_1694638757732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
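+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("b_fb_sms_lm", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("b_fb_sms_lm", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+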
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|b_fb_sms_lm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/adnankhawaja/B_FB_SMS_LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-b_t_fb_lm_en.md b/docs/_posts/ahmedlone127/2023-09-13-b_t_fb_lm_en.md new file mode 100644 index 00000000000000..2f4710361289f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-b_t_fb_lm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English b_t_fb_lm BertEmbeddings from adnankhawaja +author: John Snow Labs +name: b_t_fb_lm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `b_t_fb_lm` is an English model originally trained by adnankhawaja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/b_t_fb_lm_en_5.1.1_3.0_1694637241499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/b_t_fb_lm_en_5.1.1_3.0_1694637241499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("b_t_fb_lm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("b_t_fb_lm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|b_t_fb_lm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/adnankhawaja/B_T_FB_LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-b_t_sms_lm_en.md b/docs/_posts/ahmedlone127/2023-09-13-b_t_sms_lm_en.md new file mode 100644 index 00000000000000..eb825775e44d97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-b_t_sms_lm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English b_t_sms_lm BertEmbeddings from adnankhawaja +author: John Snow Labs +name: b_t_sms_lm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `b_t_sms_lm` is an English model originally trained by adnankhawaja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/b_t_sms_lm_en_5.1.1_3.0_1694638241378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/b_t_sms_lm_en_5.1.1_3.0_1694638241378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("b_t_sms_lm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("b_t_sms_lm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|b_t_sms_lm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/adnankhawaja/B_T_SMS_LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-backbone_bertnsp_600_en.md b/docs/_posts/ahmedlone127/2023-09-13-backbone_bertnsp_600_en.md new file mode 100644 index 00000000000000..246af4e1869171 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-backbone_bertnsp_600_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English backbone_bertnsp_600 BertEmbeddings from approach0 +author: John Snow Labs +name: backbone_bertnsp_600 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `backbone_bertnsp_600` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/backbone_bertnsp_600_en_5.1.1_3.0_1694618178947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/backbone_bertnsp_600_en_5.1.1_3.0_1694618178947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("backbone_bertnsp_600","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("backbone_bertnsp_600", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|backbone_bertnsp_600| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/backbone-bertnsp-600 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-backbone_cocomae_600_en.md b/docs/_posts/ahmedlone127/2023-09-13-backbone_cocomae_600_en.md new file mode 100644 index 00000000000000..ab45a99b6dd697 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-backbone_cocomae_600_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English backbone_cocomae_600 BertEmbeddings from approach0 +author: John Snow Labs +name: backbone_cocomae_600 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `backbone_cocomae_600` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/backbone_cocomae_600_en_5.1.1_3.0_1694617810215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/backbone_cocomae_600_en_5.1.1_3.0_1694617810215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("backbone_cocomae_600","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("backbone_cocomae_600", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|backbone_cocomae_600| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/backbone-cocomae-600 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-backbone_cocondenser_600_en.md b/docs/_posts/ahmedlone127/2023-09-13-backbone_cocondenser_600_en.md new file mode 100644 index 00000000000000..9756e971664562 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-backbone_cocondenser_600_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English backbone_cocondenser_600 BertEmbeddings from approach0 +author: John Snow Labs +name: backbone_cocondenser_600 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `backbone_cocondenser_600` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/backbone_cocondenser_600_en_5.1.1_3.0_1694618588688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/backbone_cocondenser_600_en_5.1.1_3.0_1694618588688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("backbone_cocondenser_600","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("backbone_cocondenser_600", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|backbone_cocondenser_600| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/backbone-cocondenser-600 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-backbone_cotmae_600_en.md b/docs/_posts/ahmedlone127/2023-09-13-backbone_cotmae_600_en.md new file mode 100644 index 00000000000000..4dfcb19b4d69c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-backbone_cotmae_600_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English backbone_cotmae_600 BertEmbeddings from approach0 +author: John Snow Labs +name: backbone_cotmae_600 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `backbone_cotmae_600` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/backbone_cotmae_600_en_5.1.1_3.0_1694619025139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/backbone_cotmae_600_en_5.1.1_3.0_1694619025139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("backbone_cotmae_600","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("backbone_cotmae_600", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|backbone_cotmae_600| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/backbone-cotmae-600 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-batteryscibert_cased_en.md b/docs/_posts/ahmedlone127/2023-09-13-batteryscibert_cased_en.md new file mode 100644 index 00000000000000..f08dbd892aff21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-batteryscibert_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English batteryscibert_cased BertEmbeddings from batterydata +author: John Snow Labs +name: batteryscibert_cased +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batteryscibert_cased` is an English model originally trained by batterydata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/batteryscibert_cased_en_5.1.1_3.0_1694585574709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/batteryscibert_cased_en_5.1.1_3.0_1694585574709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("batteryscibert_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("batteryscibert_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|batteryscibert_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/batterydata/batteryscibert-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bengali_bert_bn.md b/docs/_posts/ahmedlone127/2023-09-13-bengali_bert_bn.md new file mode 100644 index 00000000000000..0ae5fbf956517b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bengali_bert_bn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Bengali bengali_bert BertEmbeddings from l3cube-pune +author: John Snow Labs +name: bengali_bert +date: 2023-09-13 +tags: [bert, bn, open_source, fill_mask, onnx] +task: Embeddings +language: bn +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bengali_bert` is a Bengali model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bengali_bert_bn_5.1.1_3.0_1694644522311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bengali_bert_bn_5.1.1_3.0_1694644522311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bengali_bert","bn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bengali_bert", "bn") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bengali_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|bn| +|Size:|890.5 MB| + +## References + +https://huggingface.co/l3cube-pune/bengali-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-berdou_200k_en.md b/docs/_posts/ahmedlone127/2023-09-13-berdou_200k_en.md new file mode 100644 index 00000000000000..6dcb474ef98322 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-berdou_200k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English berdou_200k BertEmbeddings from flavio-nakasato +author: John Snow Labs +name: berdou_200k +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berdou_200k` is an English model originally trained by flavio-nakasato. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berdou_200k_en_5.1.1_3.0_1694641173323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berdou_200k_en_5.1.1_3.0_1694641173323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("berdou_200k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("berdou_200k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berdou_200k| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/flavio-nakasato/berdou_200k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-berdou_500k_en.md b/docs/_posts/ahmedlone127/2023-09-13-berdou_500k_en.md new file mode 100644 index 00000000000000..5e89f0cbc0081d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-berdou_500k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English berdou_500k BertEmbeddings from flavio-nakasato +author: John Snow Labs +name: berdou_500k +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berdou_500k` is an English model originally trained by flavio-nakasato. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berdou_500k_en_5.1.1_3.0_1694641766005.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berdou_500k_en_5.1.1_3.0_1694641766005.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("berdou_500k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("berdou_500k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berdou_500k| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/flavio-nakasato/berdou_500k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_arabert_ar.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_arabert_ar.md new file mode 100644 index 00000000000000..5c99d91275cbaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_arabert_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_base_arabert BertEmbeddings from aubmindlab +author: John Snow Labs +name: bert_base_arabert +date: 2023-09-13 +tags: [bert, ar, open_source, fill_mask, onnx] +task: Embeddings +language: ar +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabert` is an Arabic model originally trained by aubmindlab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabert_ar_5.1.1_3.0_1694582655825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabert_ar_5.1.1_3.0_1694582655825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_arabert","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_arabert", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arabert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ar| +|Size:|504.6 MB| + +## References + +https://huggingface.co/aubmindlab/bert-base-arabert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_arabic_miner_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_arabic_miner_en.md new file mode 100644 index 00000000000000..cae8a1248a1c4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_arabic_miner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_arabic_miner BertEmbeddings from giganticode +author: John Snow Labs +name: bert_base_arabic_miner +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabic_miner` is an English model originally trained by giganticode. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabic_miner_en_5.1.1_3.0_1694649546700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabic_miner_en_5.1.1_3.0_1694649546700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_arabic_miner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_arabic_miner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arabic_miner| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/giganticode/bert-base-ar_miner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_arapoembert_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_arapoembert_en.md new file mode 100644 index 00000000000000..c4614ae96da625 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_arapoembert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_arapoembert BertEmbeddings from faisalq +author: John Snow Labs +name: bert_base_arapoembert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arapoembert` is an English model originally trained by faisalq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arapoembert_en_5.1.1_3.0_1694614180429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arapoembert_en_5.1.1_3.0_1694614180429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_arapoembert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_arapoembert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arapoembert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/faisalq/bert-base-arapoembert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_10_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_10_mlm_en.md new file mode 100644 index 00000000000000..bf002b47398ea9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_10_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_10_mlm BertEmbeddings from rithwik-db +author: John Snow Labs +name: bert_base_cased_10_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_10_mlm` is an English model originally trained by rithwik-db. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_10_mlm_en_5.1.1_3.0_1694615520000.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_10_mlm_en_5.1.1_3.0_1694615520000.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_10_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_10_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_10_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/rithwik-db/bert-base-cased-10-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_500_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_500_mlm_en.md new file mode 100644 index 00000000000000..aaefbd47534a47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_500_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_500_mlm BertEmbeddings from rithwik-db +author: John Snow Labs +name: bert_base_cased_500_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_500_mlm` is an English model originally trained by rithwik-db. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_500_mlm_en_5.1.1_3.0_1694616384008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_500_mlm_en_5.1.1_3.0_1694616384008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_500_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_500_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_500_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.5 MB| + +## References + +https://huggingface.co/rithwik-db/bert-base-cased-500-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_50_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_50_mlm_en.md new file mode 100644 index 00000000000000..5987f965b1449e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_50_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_50_mlm BertEmbeddings from rithwik-db +author: John Snow Labs +name: bert_base_cased_50_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_50_mlm` is an English model originally trained by rithwik-db. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_50_mlm_en_5.1.1_3.0_1694616027167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_50_mlm_en_5.1.1_3.0_1694616027167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_50_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_50_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_50_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/rithwik-db/bert-base-cased-50-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_b4h7_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_b4h7_en.md new file mode 100644 index 00000000000000..aaae0d23b50f37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_b4h7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_b4h7 BertEmbeddings from mdroth +author: John Snow Labs +name: bert_base_cased_b4h7 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_b4h7` is an English model originally trained by mdroth. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_b4h7_en_5.1.1_3.0_1694626233588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_b4h7_en_5.1.1_3.0_1694626233588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_b4h7","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_b4h7", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_b4h7| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/mdroth/bert-base-cased_B4H7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_finetuned_bert_auto7_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_finetuned_bert_auto7_en.md new file mode 100644 index 00000000000000..348063868c202e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_finetuned_bert_auto7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_bert_auto7 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_cased_finetuned_bert_auto7 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_bert_auto7` is an English model originally trained by himanimaheshwari3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_bert_auto7_en_5.1.1_3.0_1694639557369.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_bert_auto7_en_5.1.1_3.0_1694639557369.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_bert_auto7","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_finetuned_bert_auto7", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_bert_auto7| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-cased-finetuned-bert-auto7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_finetuned_bert_mlm5_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_finetuned_bert_mlm5_en.md new file mode 100644 index 00000000000000..adbe9eb0786dbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_finetuned_bert_mlm5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_bert_mlm5 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_cased_finetuned_bert_mlm5 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_bert_mlm5` is an English model originally trained by himanimaheshwari3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_bert_mlm5_en_5.1.1_3.0_1694642519725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_bert_mlm5_en_5.1.1_3.0_1694642519725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_bert_mlm5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_finetuned_bert_mlm5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_bert_mlm5| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-cased-finetuned-BERT-mlm5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_googlere_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_googlere_en.md new file mode 100644 index 00000000000000..f8d9e5f2f87878 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_googlere_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_googlere BertEmbeddings from triet1102 +author: John Snow Labs +name: bert_base_cased_googlere +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_googlere` is an English model originally trained by triet1102. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_googlere_en_5.1.1_3.0_1694614086575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_googlere_en_5.1.1_3.0_1694614086575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_googlere","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_googlere", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_googlere| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/triet1102/bert-base-cased-GoogleRE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_googlere_masked_subj_obj_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_googlere_masked_subj_obj_en.md new file mode 100644 index 00000000000000..a2a1a54c42a41e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_googlere_masked_subj_obj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_googlere_masked_subj_obj BertEmbeddings from triet1102 +author: John Snow Labs +name: bert_base_cased_googlere_masked_subj_obj +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_googlere_masked_subj_obj` is an English model originally trained by triet1102. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_googlere_masked_subj_obj_en_5.1.1_3.0_1694615522376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_googlere_masked_subj_obj_en_5.1.1_3.0_1694615522376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_googlere_masked_subj_obj","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_googlere_masked_subj_obj", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_googlere_masked_subj_obj| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/triet1102/bert-base-cased-GoogleRE-masked-subj-obj \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_model_attribution_challenge_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_model_attribution_challenge_en.md new file mode 100644 index 00000000000000..c80b08154ec392 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_model_attribution_challenge_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_model_attribution_challenge BertEmbeddings from model-attribution-challenge +author: John Snow Labs +name: bert_base_cased_model_attribution_challenge +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_model_attribution_challenge` is an English model originally trained by model-attribution-challenge.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_model_attribution_challenge_en_5.1.1_3.0_1694627672776.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_model_attribution_challenge_en_5.1.1_3.0_1694627672776.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_model_attribution_challenge","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_model_attribution_challenge", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_model_attribution_challenge| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/model-attribution-challenge/bert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_portuguese_c_corpus_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_portuguese_c_corpus_en.md new file mode 100644 index 00000000000000..deb118ac14cd10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_portuguese_c_corpus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_portuguese_c_corpus BertEmbeddings from rosimeirecosta +author: John Snow Labs +name: bert_base_cased_portuguese_c_corpus +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_portuguese_c_corpus` is an English model originally trained by rosimeirecosta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_portuguese_c_corpus_en_5.1.1_3.0_1694648701939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_portuguese_c_corpus_en_5.1.1_3.0_1694648701939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_portuguese_c_corpus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_portuguese_c_corpus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_portuguese_c_corpus| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/rosimeirecosta/bert-base-cased-pt-c-corpus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_portuguese_lenerbr_vittorio_girardi_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_portuguese_lenerbr_vittorio_girardi_en.md new file mode 100644 index 00000000000000..bc36d06fbffd5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_cased_portuguese_lenerbr_vittorio_girardi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_portuguese_lenerbr_vittorio_girardi BertEmbeddings from vittorio-girardi +author: John Snow Labs +name: bert_base_cased_portuguese_lenerbr_vittorio_girardi +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_portuguese_lenerbr_vittorio_girardi` is an English model originally trained by vittorio-girardi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_portuguese_lenerbr_vittorio_girardi_en_5.1.1_3.0_1694634389905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_portuguese_lenerbr_vittorio_girardi_en_5.1.1_3.0_1694634389905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_cased_portuguese_lenerbr_vittorio_girardi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_portuguese_lenerbr_vittorio_girardi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_portuguese_lenerbr_vittorio_girardi| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/vittorio-girardi/bert-base-cased-pt-lenerbr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_chinese_complaint_128_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_chinese_complaint_128_en.md new file mode 100644 index 00000000000000..c6536d7c5700a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_chinese_complaint_128_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_chinese_complaint_128 BertEmbeddings from xxr +author: John Snow Labs +name: bert_base_chinese_complaint_128 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_complaint_128` is an English model originally trained by xxr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_complaint_128_en_5.1.1_3.0_1694627957165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_complaint_128_en_5.1.1_3.0_1694627957165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_chinese_complaint_128","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_chinese_complaint_128", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_complaint_128| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/xxr/bert-base-chinese-complaint-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_dewiki_v1_de.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_dewiki_v1_de.md new file mode 100644 index 00000000000000..ce40740b60e807 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_dewiki_v1_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_base_dewiki_v1 BertEmbeddings from gwlms +author: John Snow Labs +name: bert_base_dewiki_v1 +date: 2023-09-13 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_dewiki_v1` is a German model originally trained by gwlms. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_dewiki_v1_de_5.1.1_3.0_1694623056420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_dewiki_v1_de_5.1.1_3.0_1694623056420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_dewiki_v1","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_dewiki_v1", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_dewiki_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|412.4 MB| + +## References + +https://huggingface.co/gwlms/bert-base-dewiki-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_generics_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_generics_mlm_en.md new file mode 100644 index 00000000000000..58af567f01aee7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_generics_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_generics_mlm BertEmbeddings from sello-ralethe +author: John Snow Labs +name: bert_base_generics_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_generics_mlm` is an English model originally trained by sello-ralethe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_generics_mlm_en_5.1.1_3.0_1694572846320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_generics_mlm_en_5.1.1_3.0_1694572846320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_generics_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_generics_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_generics_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/sello-ralethe/bert-base-generics-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_german_cased_finetuned_swiss_de.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_german_cased_finetuned_swiss_de.md new file mode 100644 index 00000000000000..568ac4a96b1c4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_german_cased_finetuned_swiss_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_base_german_cased_finetuned_swiss BertEmbeddings from statworx +author: John Snow Labs +name: bert_base_german_cased_finetuned_swiss +date: 2023-09-13 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_cased_finetuned_swiss` is a German model originally trained by statworx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_finetuned_swiss_de_5.1.1_3.0_1694635969690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_finetuned_swiss_de_5.1.1_3.0_1694635969690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_german_cased_finetuned_swiss","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_german_cased_finetuned_swiss", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_finetuned_swiss| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|406.9 MB| + +## References + +https://huggingface.co/statworx/bert-base-german-cased-finetuned-swiss \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_historic_multilingual_64k_td_cased_xx.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_historic_multilingual_64k_td_cased_xx.md new file mode 100644 index 00000000000000..8928e9d251c364 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_historic_multilingual_64k_td_cased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_historic_multilingual_64k_td_cased BertEmbeddings from dbmdz +author: John Snow Labs +name: bert_base_historic_multilingual_64k_td_cased +date: 2023-09-13 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_historic_multilingual_64k_td_cased` is a Multilingual model originally trained by dbmdz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_historic_multilingual_64k_td_cased_xx_5.1.1_3.0_1694618977228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_historic_multilingual_64k_td_cased_xx_5.1.1_3.0_1694618977228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_historic_multilingual_64k_td_cased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_historic_multilingual_64k_td_cased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_historic_multilingual_64k_td_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|504.6 MB| + +## References + +https://huggingface.co/dbmdz/bert-base-historic-multilingual-64k-td-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_italian_uncased_osiria_it.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_italian_uncased_osiria_it.md new file mode 100644 index 00000000000000..96dfd316df2714 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_italian_uncased_osiria_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian bert_base_italian_uncased_osiria BertEmbeddings from osiria +author: John Snow Labs +name: bert_base_italian_uncased_osiria +date: 2023-09-13 +tags: [bert, it, open_source, fill_mask, onnx] +task: Embeddings +language: it +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_italian_uncased_osiria` is an Italian model originally trained by osiria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_italian_uncased_osiria_it_5.1.1_3.0_1694576263884.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_italian_uncased_osiria_it_5.1.1_3.0_1694576263884.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_italian_uncased_osiria","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_italian_uncased_osiria", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_italian_uncased_osiria| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|it| +|Size:|407.1 MB| + +## References + +https://huggingface.co/osiria/bert-base-italian-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_japanese_ssuw_ja.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_japanese_ssuw_ja.md new file mode 100644 index 00000000000000..a423f277841279 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_japanese_ssuw_ja.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Japanese bert_base_japanese_ssuw BertEmbeddings from ku-accms +author: John Snow Labs +name: bert_base_japanese_ssuw +date: 2023-09-13 +tags: [bert, ja, open_source, fill_mask, onnx] +task: Embeddings +language: ja +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_japanese_ssuw` is a Japanese model originally trained by ku-accms. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_japanese_ssuw_ja_5.1.1_3.0_1694642519721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_japanese_ssuw_ja_5.1.1_3.0_1694642519721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_japanese_ssuw","ja") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_japanese_ssuw", "ja") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_japanese_ssuw| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ja| +|Size:|412.0 MB| + +## References + +https://huggingface.co/ku-accms/bert-base-japanese-ssuw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_cased_finetuned_lener_breton_xx.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_cased_finetuned_lener_breton_xx.md new file mode 100644 index 00000000000000..40d3a6bee214e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_cased_finetuned_lener_breton_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_lener_breton BertEmbeddings from Luciano +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_lener_breton +date: 2023-09-13 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_lener_breton` is a Multilingual model originally trained by Luciano.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_lener_breton_xx_5.1.1_3.0_1694617850537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_lener_breton_xx_5.1.1_3.0_1694617850537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_multilingual_cased_finetuned_lener_breton","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_multilingual_cased_finetuned_lener_breton", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_lener_breton| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Luciano/bert-base-multilingual-cased-finetuned-lener_br \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_cased_iwslt14deen_xx.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_cased_iwslt14deen_xx.md new file mode 100644 index 00000000000000..5609e17478e5a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_cased_iwslt14deen_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_iwslt14deen BertEmbeddings from miugod +author: John Snow Labs +name: bert_base_multilingual_cased_iwslt14deen +date: 2023-09-13 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_iwslt14deen` is a Multilingual model originally trained by miugod. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_iwslt14deen_xx_5.1.1_3.0_1694598191715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_iwslt14deen_xx_5.1.1_3.0_1694598191715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_multilingual_cased_iwslt14deen","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_multilingual_cased_iwslt14deen", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_iwslt14deen| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|664.2 MB| + +## References + +https://huggingface.co/miugod/bert-base-multilingual-cased-iwslt14deen \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_uncased_pretrained_xx.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_uncased_pretrained_xx.md new file mode 100644 index 00000000000000..925cbcc12df313 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_multilingual_uncased_pretrained_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_pretrained BertEmbeddings from am-shb +author: John Snow Labs +name: bert_base_multilingual_uncased_pretrained +date: 2023-09-13 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_uncased_pretrained` is a Multilingual model originally trained by am-shb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_pretrained_xx_5.1.1_3.0_1694579516141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_pretrained_xx_5.1.1_3.0_1694579516141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_multilingual_uncased_pretrained","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_multilingual_uncased_pretrained", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_pretrained| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/am-shb/bert-base-multilingual-uncased-pretrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_pashto_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_pashto_en.md new file mode 100644 index 00000000000000..79e6809ee1b7b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_pashto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_pashto BertEmbeddings from ijazulhaq +author: John Snow Labs +name: bert_base_pashto +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_pashto` is an English model originally trained by ijazulhaq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_pashto_en_5.1.1_3.0_1694628431394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_pashto_en_5.1.1_3.0_1694628431394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_pashto","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_pashto", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_pashto| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.4 MB| + +## References + +https://huggingface.co/ijazulhaq/bert-base-pashto \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_pashto_v1_ps.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_pashto_v1_ps.md new file mode 100644 index 00000000000000..b618258ebd17d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_pashto_v1_ps.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Pashto, Pushto bert_base_pashto_v1 BertEmbeddings from ijazulhaq +author: John Snow Labs +name: bert_base_pashto_v1 +date: 2023-09-13 +tags: [bert, ps, open_source, fill_mask, onnx] +task: Embeddings +language: ps +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_pashto_v1` is a Pashto, Pushto model originally trained by ijazulhaq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_pashto_v1_ps_5.1.1_3.0_1694648224326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_pashto_v1_ps_5.1.1_3.0_1694648224326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
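+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_pashto_v1","ps") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_pashto_v1", "ps") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +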
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_pashto_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ps| +|Size:|406.3 MB| + +## References + +https://huggingface.co/ijazulhaq/bert-base-pashto-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_polish_uncased_v1_pl.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_polish_uncased_v1_pl.md new file mode 100644 index 00000000000000..8f0551a218592c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_polish_uncased_v1_pl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Polish bert_base_polish_uncased_v1 BertEmbeddings from dkleczek +author: John Snow Labs +name: bert_base_polish_uncased_v1 +date: 2023-09-13 +tags: [bert, pl, open_source, fill_mask, onnx] +task: Embeddings +language: pl +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_polish_uncased_v1` is a Polish model originally trained by dkleczek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_polish_uncased_v1_pl_5.1.1_3.0_1694625516398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_polish_uncased_v1_pl_5.1.1_3.0_1694625516398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
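+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_polish_uncased_v1","pl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_polish_uncased_v1", "pl") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +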
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_polish_uncased_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|pl| +|Size:|493.5 MB| + +## References + +https://huggingface.co/dkleczek/bert-base-polish-uncased-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_portuguese_cased_finetuned_enjoei_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_portuguese_cased_finetuned_enjoei_en.md new file mode 100644 index 00000000000000..2fe536e596bd4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_portuguese_cased_finetuned_enjoei_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_portuguese_cased_finetuned_enjoei BertEmbeddings from gabrielgmendonca +author: John Snow Labs +name: bert_base_portuguese_cased_finetuned_enjoei +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_finetuned_enjoei` is an English model originally trained by gabrielgmendonca. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_finetuned_enjoei_en_5.1.1_3.0_1694618398309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_finetuned_enjoei_en_5.1.1_3.0_1694618398309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
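+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_portuguese_cased_finetuned_enjoei","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_portuguese_cased_finetuned_enjoei", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +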
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_finetuned_enjoei| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/gabrielgmendonca/bert-base-portuguese-cased-finetuned-enjoei \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_romanian_cased_v1_ro.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_romanian_cased_v1_ro.md new file mode 100644 index 00000000000000..e92c70cd72616d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_romanian_cased_v1_ro.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Moldavian, Moldovan, Romanian bert_base_romanian_cased_v1 BertEmbeddings from dumitrescustefan +author: John Snow Labs +name: bert_base_romanian_cased_v1 +date: 2023-09-13 +tags: [bert, ro, open_source, fill_mask, onnx] +task: Embeddings +language: ro +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_romanian_cased_v1` is a Moldavian, Moldovan, Romanian model originally trained by dumitrescustefan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_romanian_cased_v1_ro_5.1.1_3.0_1694627761142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_romanian_cased_v1_ro_5.1.1_3.0_1694627761142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
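+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_romanian_cased_v1","ro") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_romanian_cased_v1", "ro") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +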
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_romanian_cased_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ro| +|Size:|464.0 MB| + +## References + +https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_romanian_uncased_v1_ro.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_romanian_uncased_v1_ro.md new file mode 100644 index 00000000000000..342db426389a75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_romanian_uncased_v1_ro.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Moldavian, Moldovan, Romanian bert_base_romanian_uncased_v1 BertEmbeddings from dumitrescustefan +author: John Snow Labs +name: bert_base_romanian_uncased_v1 +date: 2023-09-13 +tags: [bert, ro, open_source, fill_mask, onnx] +task: Embeddings +language: ro +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_romanian_uncased_v1` is a Moldavian, Moldovan, Romanian model originally trained by dumitrescustefan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_romanian_uncased_v1_ro_5.1.1_3.0_1694628307563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_romanian_uncased_v1_ro_5.1.1_3.0_1694628307563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
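+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_romanian_uncased_v1","ro") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_romanian_uncased_v1", "ro") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +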
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_romanian_uncased_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ro| +|Size:|464.4 MB| + +## References + +https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_spanish_amvv_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_spanish_amvv_uncased_en.md new file mode 100644 index 00000000000000..da6672cd276449 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_spanish_amvv_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_amvv_uncased BertEmbeddings from amvargasv +author: John Snow Labs +name: bert_base_spanish_amvv_uncased +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_amvv_uncased` is an English model originally trained by amvargasv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_amvv_uncased_en_5.1.1_3.0_1694647970479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_amvv_uncased_en_5.1.1_3.0_1694647970479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
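+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_spanish_amvv_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_spanish_amvv_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +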
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_amvv_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/amvargasv/bert-base-spanish-amvv-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_stackoverflow_comments_1m_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_stackoverflow_comments_1m_en.md new file mode 100644 index 00000000000000..67b2321f93e974 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_stackoverflow_comments_1m_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_stackoverflow_comments_1m BertEmbeddings from giganticode +author: John Snow Labs +name: bert_base_stackoverflow_comments_1m +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_stackoverflow_comments_1m` is an English model originally trained by giganticode. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_stackoverflow_comments_1m_en_5.1.1_3.0_1694648985744.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_stackoverflow_comments_1m_en_5.1.1_3.0_1694648985744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
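+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_stackoverflow_comments_1m","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_stackoverflow_comments_1m", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +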
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_stackoverflow_comments_1m| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/giganticode/bert-base-StackOverflow-comments_1M \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_stackoverflow_comments_2m_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_stackoverflow_comments_2m_en.md new file mode 100644 index 00000000000000..083c6b57462ac3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_stackoverflow_comments_2m_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_stackoverflow_comments_2m BertEmbeddings from giganticode +author: John Snow Labs +name: bert_base_stackoverflow_comments_2m +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_stackoverflow_comments_2m` is an English model originally trained by giganticode. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_stackoverflow_comments_2m_en_5.1.1_3.0_1694649245162.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_stackoverflow_comments_2m_en_5.1.1_3.0_1694649245162.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
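+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_stackoverflow_comments_2m","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_stackoverflow_comments_2m", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +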
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_stackoverflow_comments_2m| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/giganticode/bert-base-StackOverflow-comments_2M \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_swedish_cased_alpha_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_swedish_cased_alpha_en.md new file mode 100644 index 00000000000000..ab636f8c757459 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_swedish_cased_alpha_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_swedish_cased_alpha BertEmbeddings from KBLab +author: John Snow Labs +name: bert_base_swedish_cased_alpha +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_swedish_cased_alpha` is an English model originally trained by KBLab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_swedish_cased_alpha_en_5.1.1_3.0_1694563762856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_swedish_cased_alpha_en_5.1.1_3.0_1694563762856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
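+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_swedish_cased_alpha","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_swedish_cased_alpha", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +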
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_swedish_cased_alpha| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/KBLab/bert-base-swedish-cased-alpha \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_token_dropping_dewiki_v1_de.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_token_dropping_dewiki_v1_de.md new file mode 100644 index 00000000000000..d598a22adbfa34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_token_dropping_dewiki_v1_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_base_token_dropping_dewiki_v1 BertEmbeddings from gwlms +author: John Snow Labs +name: bert_base_token_dropping_dewiki_v1 +date: 2023-09-13 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_token_dropping_dewiki_v1` is a German model originally trained by gwlms. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_token_dropping_dewiki_v1_de_5.1.1_3.0_1694624494518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_token_dropping_dewiki_v1_de_5.1.1_3.0_1694624494518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
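+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_token_dropping_dewiki_v1","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_token_dropping_dewiki_v1", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +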
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_token_dropping_dewiki_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|412.4 MB| + +## References + +https://huggingface.co/gwlms/bert-base-token-dropping-dewiki-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_turkish_cased_offensive_mlm_tr.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_turkish_cased_offensive_mlm_tr.md new file mode 100644 index 00000000000000..c0763e2b347050 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_turkish_cased_offensive_mlm_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish bert_base_turkish_cased_offensive_mlm BertEmbeddings from Overfit-GM +author: John Snow Labs +name: bert_base_turkish_cased_offensive_mlm +date: 2023-09-13 +tags: [bert, tr, open_source, fill_mask, onnx] +task: Embeddings +language: tr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_cased_offensive_mlm` is a Turkish model originally trained by Overfit-GM. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_cased_offensive_mlm_tr_5.1.1_3.0_1694608252958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_cased_offensive_mlm_tr_5.1.1_3.0_1694608252958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
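+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_turkish_cased_offensive_mlm","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_turkish_cased_offensive_mlm", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +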
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_cased_offensive_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|412.3 MB| + +## References + +https://huggingface.co/Overfit-GM/bert-base-turkish-cased-offensive-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_2_finetuned_rramicus_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_2_finetuned_rramicus_en.md new file mode 100644 index 00000000000000..de88314f4642ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_2_finetuned_rramicus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_2_finetuned_rramicus BertEmbeddings from repro-rights-amicus-briefs +author: John Snow Labs +name: bert_base_uncased_2_finetuned_rramicus +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_2_finetuned_rramicus` is an English model originally trained by repro-rights-amicus-briefs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_2_finetuned_rramicus_en_5.1.1_3.0_1694636800409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_2_finetuned_rramicus_en_5.1.1_3.0_1694636800409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
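+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertEmbeddings +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_2_finetuned_rramicus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotators.Tokenizer +import com.johnsnowlabs.nlp.embeddings.BertEmbeddings +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text" +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_2_finetuned_rramicus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +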
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_2_finetuned_rramicus|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/repro-rights-amicus-briefs/bert-base-uncased-2-finetuned-RRamicus
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v1_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v1_en.md
new file mode 100644
index 00000000000000..89129e45c8ba2b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_ancient_greek_v1 BertEmbeddings from Sonnenblume
+author: John Snow Labs
+name: bert_base_uncased_ancient_greek_v1
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ancient_greek_v1` is an English model originally trained by Sonnenblume.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ancient_greek_v1_en_5.1.1_3.0_1694616781198.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ancient_greek_v1_en_5.1.1_3.0_1694616781198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
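+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_ancient_greek_v1", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_ancient_greek_v1", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+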
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_ancient_greek_v1|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|421.0 MB|
+
+## References
+
+https://huggingface.co/Sonnenblume/bert-base-uncased-ancient-greek-v1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v3_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v3_en.md
new file mode 100644
index 00000000000000..19179b1becd45f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v3_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_ancient_greek_v3 BertEmbeddings from Sonnenblume
+author: John Snow Labs
+name: bert_base_uncased_ancient_greek_v3
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ancient_greek_v3` is an English model originally trained by Sonnenblume.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ancient_greek_v3_en_5.1.1_3.0_1694617164732.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ancient_greek_v3_en_5.1.1_3.0_1694617164732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
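+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_ancient_greek_v3", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_ancient_greek_v3", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+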
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_ancient_greek_v3|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|421.0 MB|
+
+## References
+
+https://huggingface.co/Sonnenblume/bert-base-uncased-ancient-greek-v3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v4_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v4_en.md
new file mode 100644
index 00000000000000..01260c6d80d789
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_ancient_greek_v4_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_ancient_greek_v4 BertEmbeddings from Sonnenblume
+author: John Snow Labs
+name: bert_base_uncased_ancient_greek_v4
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ancient_greek_v4` is an English model originally trained by Sonnenblume.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ancient_greek_v4_en_5.1.1_3.0_1694636309967.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ancient_greek_v4_en_5.1.1_3.0_1694636309967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
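+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_ancient_greek_v4", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_ancient_greek_v4", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+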
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_ancient_greek_v4|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|421.0 MB|
+
+## References
+
+https://huggingface.co/Sonnenblume/bert-base-uncased-ancient-greek-v4
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto5_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto5_en.md
new file mode 100644
index 00000000000000..ed245ba4391507
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto5_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_bert_auto5 BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_base_uncased_finetuned_bert_auto5
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_bert_auto5` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto5_en_5.1.1_3.0_1694638038292.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto5_en_5.1.1_3.0_1694638038292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
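+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_bert_auto5", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_bert_auto5", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+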
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_bert_auto5|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-bert-auto5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto6_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto6_en.md
new file mode 100644
index 00000000000000..7fab8b7bb60418
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto6_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_bert_auto6 BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_base_uncased_finetuned_bert_auto6
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_bert_auto6` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto6_en_5.1.1_3.0_1694638484821.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto6_en_5.1.1_3.0_1694638484821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
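+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_bert_auto6", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_bert_auto6", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+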
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_bert_auto6|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-bert-auto6
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto7_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto7_en.md
new file mode 100644
index 00000000000000..cee0b7df827814
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto7_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_bert_auto7 BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_base_uncased_finetuned_bert_auto7
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_bert_auto7` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto7_en_5.1.1_3.0_1694638942003.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto7_en_5.1.1_3.0_1694638942003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
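+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_bert_auto7", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_bert_auto7", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+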
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_bert_auto7|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-bert-auto7
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto_en.md
new file mode 100644
index 00000000000000..93b436cea9e9b4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_auto_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_bert_auto BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_base_uncased_finetuned_bert_auto
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_bert_auto` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto_en_5.1.1_3.0_1694637609034.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_auto_en_5.1.1_3.0_1694637609034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
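+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_bert_auto", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_bert_auto", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+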
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_bert_auto|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-bert-auto
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_mlm9_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_mlm9_en.md
new file mode 100644
index 00000000000000..d8c678257f171b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_mlm9_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_bert_mlm9 BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_base_uncased_finetuned_bert_mlm9
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_bert_mlm9` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_mlm9_en_5.1.1_3.0_1694643632914.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_mlm9_en_5.1.1_3.0_1694643632914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
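+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_bert_mlm9", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_bert_mlm9", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+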
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_bert_mlm9|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-BERT-mlm9
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_mlm_en.md
new file mode 100644
index 00000000000000..e238a28cfb0d90
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_bert_mlm_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_bert_mlm BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_base_uncased_finetuned_bert_mlm
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_bert_mlm` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_mlm_en_5.1.1_3.0_1694640077705.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_bert_mlm_en_5.1.1_3.0_1694640077705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
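+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_bert_mlm", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_bert_mlm", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+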
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_bert_mlm|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-bert-mlm
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_eva_accelerate_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_eva_accelerate_en.md
new file mode 100644
index 00000000000000..ad0bc6388cdfad
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_eva_accelerate_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_eva_accelerate BertEmbeddings from CesarLeblanc
+author: John Snow Labs
+name: bert_base_uncased_finetuned_eva_accelerate
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_eva_accelerate` is an English model originally trained by CesarLeblanc.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_eva_accelerate_en_5.1.1_3.0_1694616995090.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_eva_accelerate_en_5.1.1_3.0_1694616995090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
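+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_eva_accelerate", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_eva_accelerate", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+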
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_eva_accelerate|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|440.6 MB|
+
+## References
+
+https://huggingface.co/CesarLeblanc/bert-base-uncased-finetuned-eva-accelerate
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_eva_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_eva_en.md
new file mode 100644
index 00000000000000..7f40930725eb5a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_eva_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_eva BertEmbeddings from CesarLeblanc
+author: John Snow Labs
+name: bert_base_uncased_finetuned_eva
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_eva` is an English model originally trained by CesarLeblanc.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_eva_en_5.1.1_3.0_1694616645644.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_eva_en_5.1.1_3.0_1694616645644.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
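+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import BertEmbeddings, Tokenizer
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_eva", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings consumes a "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_eva", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+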
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_eva|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|440.7 MB|
+
+## References
+
+https://huggingface.co/CesarLeblanc/bert-base-uncased-finetuned-eva
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_auto3_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_auto3_en.md
new file mode 100644
index 00000000000000..1f5ca3901ccdd8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_auto3_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_himani_auto3 BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_base_uncased_finetuned_himani_auto3
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_auto3` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_auto3_en_5.1.1_3.0_1694637243769.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_auto3_en_5.1.1_3.0_1694637243769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_auto3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_auto3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_auto3| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-auto3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_12_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_12_en.md new file mode 100644 index 00000000000000..30a52eb8ef2ce8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_himani_gen_mlm_12 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_uncased_finetuned_himani_gen_mlm_12 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_gen_mlm_12` is an English model originally trained by himanimaheshwari3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_12_en_5.1.1_3.0_1694646419148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_12_en_5.1.1_3.0_1694646419148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_gen_mlm_12","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_gen_mlm_12", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_gen_mlm_12| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-gen-MLM-12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_13_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_13_en.md new file mode 100644 index 00000000000000..025b90f82be72c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_himani_gen_mlm_13 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_uncased_finetuned_himani_gen_mlm_13 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_gen_mlm_13` is an English model originally trained by himanimaheshwari3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_13_en_5.1.1_3.0_1694647435616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_13_en_5.1.1_3.0_1694647435616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_gen_mlm_13","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_gen_mlm_13", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_gen_mlm_13| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-gen-MLM-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_14_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_14_en.md new file mode 100644 index 00000000000000..1f93fa43a398d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_14_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_himani_gen_mlm_14 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_uncased_finetuned_himani_gen_mlm_14 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_gen_mlm_14` is an English model originally trained by himanimaheshwari3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_14_en_5.1.1_3.0_1694647735738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_14_en_5.1.1_3.0_1694647735738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_gen_mlm_14","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_gen_mlm_14", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_gen_mlm_14| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-gen-MLM-14 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_15_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_15_en.md new file mode 100644 index 00000000000000..5d94900e2a8e98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_himani_gen_mlm_15 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_uncased_finetuned_himani_gen_mlm_15 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_gen_mlm_15` is an English model originally trained by himanimaheshwari3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_15_en_5.1.1_3.0_1694648220318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_15_en_5.1.1_3.0_1694648220318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_gen_mlm_15","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_gen_mlm_15", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_gen_mlm_15| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-gen-MLM-15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_1_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_1_en.md new file mode 100644 index 00000000000000..eec8122a404e29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_himani_gen_mlm_1 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_uncased_finetuned_himani_gen_mlm_1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_gen_mlm_1` is an English model originally trained by himanimaheshwari3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_1_en_5.1.1_3.0_1694645767215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_1_en_5.1.1_3.0_1694645767215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_gen_mlm_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_gen_mlm_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_gen_mlm_1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-gen-MLM-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_en.md new file mode 100644 index 00000000000000..cb4afbfc9e3672 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_gen_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_himani_gen_mlm BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_uncased_finetuned_himani_gen_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_gen_mlm` is an English model originally trained by himanimaheshwari3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_en_5.1.1_3.0_1694645433865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_gen_mlm_en_5.1.1_3.0_1694645433865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_gen_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_gen_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_gen_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-gen-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_n_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_n_en.md new file mode 100644 index 00000000000000..c7511dadb718eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_himani_n_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_himani_n BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_base_uncased_finetuned_himani_n +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_himani_n` is an English model originally trained by himanimaheshwari3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_n_en_5.1.1_3.0_1694636833361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_himani_n_en_5.1.1_3.0_1694636833361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_himani_n","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_himani_n", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_himani_n| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-base-uncased-finetuned-himani-n \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_jasheu_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_jasheu_en.md new file mode 100644 index 00000000000000..847243b5266b65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_jasheu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_imdb_jasheu BertEmbeddings from jasheu +author: John Snow Labs +name: bert_base_uncased_finetuned_imdb_jasheu +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_imdb_jasheu` is an English model originally trained by jasheu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_jasheu_en_5.1.1_3.0_1694624085789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_jasheu_en_5.1.1_3.0_1694624085789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_imdb_jasheu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_imdb_jasheu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_imdb_jasheu| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/jasheu/bert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_medhabi_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_medhabi_en.md new file mode 100644 index 00000000000000..9c8c1789981c46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_medhabi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_imdb_medhabi BertEmbeddings from medhabi +author: John Snow Labs +name: bert_base_uncased_finetuned_imdb_medhabi +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_imdb_medhabi` is an English model originally trained by medhabi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_medhabi_en_5.1.1_3.0_1694626974894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_medhabi_en_5.1.1_3.0_1694626974894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_imdb_medhabi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_imdb_medhabi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_imdb_medhabi| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/medhabi/bert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_sarmila_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_sarmila_en.md new file mode 100644 index 00000000000000..0451f1a577d372 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_finetuned_imdb_sarmila_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_imdb_sarmila BertEmbeddings from Sarmila +author: John Snow Labs +name: bert_base_uncased_finetuned_imdb_sarmila +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_imdb_sarmila` is an English model originally trained by Sarmila. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_sarmila_en_5.1.1_3.0_1694641541584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_sarmila_en_5.1.1_3.0_1694641541584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_imdb_sarmila","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_imdb_sarmila", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_imdb_sarmila| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Sarmila/bert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_antoinev17_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_antoinev17_en.md new file mode 100644 index 00000000000000..6c6943aaa25509 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_antoinev17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_antoinev17 BertEmbeddings from antoinev17 +author: John Snow Labs +name: bert_base_uncased_issues_128_antoinev17 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_antoinev17` is an English model originally trained by antoinev17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_antoinev17_en_5.1.1_3.0_1694634951894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_antoinev17_en_5.1.1_3.0_1694634951894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_antoinev17","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_antoinev17", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_antoinev17| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/antoinev17/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_betelgeux_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_betelgeux_en.md new file mode 100644 index 00000000000000..8cc07f931d8b7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_betelgeux_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_betelgeux BertEmbeddings from betelgeux +author: John Snow Labs +name: bert_base_uncased_issues_128_betelgeux +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_betelgeux` is an English model originally trained by betelgeux. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_betelgeux_en_5.1.1_3.0_1694627127578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_betelgeux_en_5.1.1_3.0_1694627127578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_betelgeux","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_betelgeux", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_betelgeux| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/betelgeux/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_cj_mills_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_cj_mills_en.md new file mode 100644 index 00000000000000..bed15b90576773 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_cj_mills_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_cj_mills BertEmbeddings from cj-mills +author: John Snow Labs +name: bert_base_uncased_issues_128_cj_mills +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_cj_mills` is an English model originally trained by cj-mills. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_cj_mills_en_5.1.1_3.0_1694645354456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_cj_mills_en_5.1.1_3.0_1694645354456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_cj_mills","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_cj_mills", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_cj_mills| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/cj-mills/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_frahman_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_frahman_en.md new file mode 100644 index 00000000000000..ffb41eba22f342 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_frahman_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_frahman BertEmbeddings from frahman +author: John Snow Labs +name: bert_base_uncased_issues_128_frahman +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_frahman` is an English model originally trained by frahman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_frahman_en_5.1.1_3.0_1694627458573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_frahman_en_5.1.1_3.0_1694627458573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_frahman","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_frahman", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_frahman| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/frahman/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_hanwoon_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_hanwoon_en.md new file mode 100644 index 00000000000000..45709508e28952 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_hanwoon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_hanwoon BertEmbeddings from Hanwoon +author: John Snow Labs +name: bert_base_uncased_issues_128_hanwoon +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_hanwoon` is an English model originally trained by Hanwoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_hanwoon_en_5.1.1_3.0_1694619277361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_hanwoon_en_5.1.1_3.0_1694619277361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_hanwoon","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_hanwoon", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_hanwoon| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Hanwoon/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_hudee_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_hudee_en.md new file mode 100644 index 00000000000000..f87d854ac2f1fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_hudee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_hudee BertEmbeddings from Hudee +author: John Snow Labs +name: bert_base_uncased_issues_128_hudee +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_hudee` is an English model originally trained by Hudee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_hudee_en_5.1.1_3.0_1694637838114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_hudee_en_5.1.1_3.0_1694637838114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_hudee","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_hudee", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_hudee| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Hudee/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_juandeun_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_juandeun_en.md new file mode 100644 index 00000000000000..f6843fa12ad757 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_juandeun_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_juandeun BertEmbeddings from juandeun +author: John Snow Labs +name: bert_base_uncased_issues_128_juandeun +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_juandeun` is an English model originally trained by juandeun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_juandeun_en_5.1.1_3.0_1694600405699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_juandeun_en_5.1.1_3.0_1694600405699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_juandeun","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_juandeun", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_juandeun| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/juandeun/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_kiki2013_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_kiki2013_en.md new file mode 100644 index 00000000000000..7858ab760fd57f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_kiki2013_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_kiki2013 BertEmbeddings from kiki2013 +author: John Snow Labs +name: bert_base_uncased_issues_128_kiki2013 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_kiki2013` is an English model originally trained by kiki2013. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_kiki2013_en_5.1.1_3.0_1694609694156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_kiki2013_en_5.1.1_3.0_1694609694156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_kiki2013","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_kiki2013", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_kiki2013| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/kiki2013/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_lijingxin_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_lijingxin_en.md new file mode 100644 index 00000000000000..0a05f166fad5ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_lijingxin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_lijingxin BertEmbeddings from lijingxin +author: John Snow Labs +name: bert_base_uncased_issues_128_lijingxin +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_lijingxin` is an English model originally trained by lijingxin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_lijingxin_en_5.1.1_3.0_1694611963447.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_lijingxin_en_5.1.1_3.0_1694611963447.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_lijingxin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_lijingxin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_lijingxin| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/lijingxin/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_martinwunderlich_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_martinwunderlich_en.md new file mode 100644 index 00000000000000..d2b2cd560a2ea8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_martinwunderlich_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_martinwunderlich BertEmbeddings from martinwunderlich +author: John Snow Labs +name: bert_base_uncased_issues_128_martinwunderlich +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_martinwunderlich` is an English model originally trained by martinwunderlich. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_martinwunderlich_en_5.1.1_3.0_1694633945364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_martinwunderlich_en_5.1.1_3.0_1694633945364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_martinwunderlich","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_martinwunderlich", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_martinwunderlich| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/martinwunderlich/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_olpa_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_olpa_en.md new file mode 100644 index 00000000000000..33466e6cf42ab6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_issues_128_olpa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_olpa BertEmbeddings from olpa +author: John Snow Labs +name: bert_base_uncased_issues_128_olpa +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_olpa` is an English model originally trained by olpa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_olpa_en_5.1.1_3.0_1694631357838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_olpa_en_5.1.1_3.0_1694631357838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_olpa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_olpa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_olpa| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/olpa/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_mlm_en.md new file mode 100644 index 00000000000000..63f27883f83cc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_mlm BertEmbeddings from wypoon +author: John Snow Labs +name: bert_base_uncased_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mlm` is an English model originally trained by wypoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlm_en_5.1.1_3.0_1694614779984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlm_en_5.1.1_3.0_1694614779984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/wypoon/bert-base-uncased-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_model_attribution_challenge_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_model_attribution_challenge_en.md new file mode 100644 index 00000000000000..aea6646fb8eb77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_model_attribution_challenge_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_model_attribution_challenge BertEmbeddings from model-attribution-challenge +author: John Snow Labs +name: bert_base_uncased_model_attribution_challenge +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_model_attribution_challenge` is an English model originally trained by model-attribution-challenge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_model_attribution_challenge_en_5.1.1_3.0_1694628303662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_model_attribution_challenge_en_5.1.1_3.0_1694628303662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_model_attribution_challenge","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_model_attribution_challenge", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_model_attribution_challenge| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/model-attribution-challenge/bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_multi_128_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_multi_128_en.md new file mode 100644 index 00000000000000..21d3f1a31b8aef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_multi_128_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_multi_128 BertEmbeddings from xxr +author: John Snow Labs +name: bert_base_uncased_multi_128 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_multi_128` is an English model originally trained by xxr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_multi_128_en_5.1.1_3.0_1694625544033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_multi_128_en_5.1.1_3.0_1694625544033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_multi_128","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_multi_128", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_multi_128| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/xxr/bert-base-uncased-multi-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_sparse_80_1x4_block_pruneofa_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_sparse_80_1x4_block_pruneofa_en.md new file mode 100644 index 00000000000000..498a7d19a4b591 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_sparse_80_1x4_block_pruneofa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_sparse_80_1x4_block_pruneofa BertEmbeddings from Intel +author: John Snow Labs +name: bert_base_uncased_sparse_80_1x4_block_pruneofa +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sparse_80_1x4_block_pruneofa` is an English model originally trained by Intel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sparse_80_1x4_block_pruneofa_en_5.1.1_3.0_1694621945507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sparse_80_1x4_block_pruneofa_en_5.1.1_3.0_1694621945507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_sparse_80_1x4_block_pruneofa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_sparse_80_1x4_block_pruneofa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sparse_80_1x4_block_pruneofa| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|195.0 MB| + +## References + +https://huggingface.co/Intel/bert-base-uncased-sparse-80-1x4-block-pruneofa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_swahili_sw.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_swahili_sw.md new file mode 100644 index 00000000000000..38c58fe8d9941e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_uncased_swahili_sw.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swahili (macrolanguage) bert_base_uncased_swahili BertEmbeddings from flax-community +author: John Snow Labs +name: bert_base_uncased_swahili +date: 2023-09-13 +tags: [bert, sw, open_source, fill_mask, onnx] +task: Embeddings +language: sw +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_swahili` is a Swahili (macrolanguage) model originally trained by flax-community. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_swahili_sw_5.1.1_3.0_1694642650442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_swahili_sw_5.1.1_3.0_1694642650442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_swahili","sw") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_swahili", "sw") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_swahili| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|sw| +|Size:|408.0 MB| + +## References + +https://huggingface.co/flax-community/bert-base-uncased-swahili \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_base_vn_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_base_vn_en.md new file mode 100644 index 00000000000000..4397c42a1ae458 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_base_vn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_vn BertEmbeddings from NlpHUST +author: John Snow Labs +name: bert_base_vn +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_vn` is an English model originally trained by NlpHUST. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_vn_en_5.1.1_3.0_1694624251745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_vn_en_5.1.1_3.0_1694624251745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_vn","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_vn", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_vn| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|498.8 MB| + +## References + +https://huggingface.co/NlpHUST/bert-base-vn \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_based_answer_model_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_based_answer_model_en.md new file mode 100644 index 00000000000000..46cfd40d698bc4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_based_answer_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_based_answer_model BertEmbeddings from Kunjesh07 +author: John Snow Labs +name: bert_based_answer_model +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_based_answer_model` is an English model originally trained by Kunjesh07. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_based_answer_model_en_5.1.1_3.0_1694616027282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_based_answer_model_en_5.1.1_3.0_1694616027282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_based_answer_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_based_answer_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_based_answer_model| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Kunjesh07/Bert-Based-Answer-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_cluster_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_cluster_en.md new file mode 100644 index 00000000000000..29a284c498eb04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_cluster_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_cluster BertEmbeddings from mipatov +author: John Snow Labs +name: bert_cluster +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cluster` is an English model originally trained by mipatov. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cluster_en_5.1.1_3.0_1694647045460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cluster_en_5.1.1_3.0_1694647045460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_cluster","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_cluster", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_cluster| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|664.9 MB| + +## References + +https://huggingface.co/mipatov/bert_cluster \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_fintuning_test1_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_fintuning_test1_en.md new file mode 100644 index 00000000000000..421adfc07a9f4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_fintuning_test1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_fintuning_test1 BertEmbeddings from ZhaoyiGUAN +author: John Snow Labs +name: bert_fintuning_test1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fintuning_test1` is an English model originally trained by ZhaoyiGUAN. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fintuning_test1_en_5.1.1_3.0_1694574944830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fintuning_test1_en_5.1.1_3.0_1694574944830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_fintuning_test1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_fintuning_test1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fintuning_test1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/ZhaoyiGUAN/Bert_Fintuning_Test1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_kor_base_pz_language_test_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_kor_base_pz_language_test_en.md new file mode 100644 index 00000000000000..ec736e433c0d7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_kor_base_pz_language_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_kor_base_pz_language_test BertEmbeddings from Hanwoon +author: John Snow Labs +name: bert_kor_base_pz_language_test +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_kor_base_pz_language_test` is an English model originally trained by Hanwoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_kor_base_pz_language_test_en_5.1.1_3.0_1694620643379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_kor_base_pz_language_test_en_5.1.1_3.0_1694620643379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_kor_base_pz_language_test","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_kor_base_pz_language_test", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_kor_base_pz_language_test| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|441.1 MB| + +## References + +https://huggingface.co/Hanwoon/bert-kor-base-pz-language-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h240_a12_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h240_a12_en.md new file mode 100644 index 00000000000000..17ce53d63b13f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h240_a12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_l12_h240_a12 BertEmbeddings from eli4s +author: John Snow Labs +name: bert_l12_h240_a12 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_l12_h240_a12` is an English model originally trained by eli4s. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_l12_h240_a12_en_5.1.1_3.0_1694629467102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_l12_h240_a12_en_5.1.1_3.0_1694629467102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_l12_h240_a12","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_l12_h240_a12", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_l12_h240_a12| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|105.1 MB| + +## References + +https://huggingface.co/eli4s/Bert-L12-h240-A12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h256_a4_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h256_a4_en.md new file mode 100644 index 00000000000000..63e2963fe9b7ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h256_a4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_l12_h256_a4 BertEmbeddings from eli4s +author: John Snow Labs +name: bert_l12_h256_a4 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_l12_h256_a4` is an English model originally trained by eli4s. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_l12_h256_a4_en_5.1.1_3.0_1694629695737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_l12_h256_a4_en_5.1.1_3.0_1694629695737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_l12_h256_a4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_l12_h256_a4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_l12_h256_a4| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|112.8 MB| + +## References + +https://huggingface.co/eli4s/Bert-L12-h256-A4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h384_a6_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h384_a6_en.md new file mode 100644 index 00000000000000..2bda8fd14d7977 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_l12_h384_a6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_l12_h384_a6 BertEmbeddings from eli4s +author: John Snow Labs +name: bert_l12_h384_a6 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_l12_h384_a6` is an English model originally trained by eli4s. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_l12_h384_a6_en_5.1.1_3.0_1694629972509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_l12_h384_a6_en_5.1.1_3.0_1694629972509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_l12_h384_a6","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_l12_h384_a6", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_l12_h384_a6| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|177.8 MB| + +## References + +https://huggingface.co/eli4s/Bert-L12-h384-A6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_finetuned_hkdse_english_paper4_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_finetuned_hkdse_english_paper4_en.md new file mode 100644 index 00000000000000..1906ba067e6f5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_finetuned_hkdse_english_paper4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_finetuned_hkdse_english_paper4 BertEmbeddings from Wootang01 +author: John Snow Labs +name: bert_large_cased_finetuned_hkdse_english_paper4 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_finetuned_hkdse_english_paper4` is an English model originally trained by Wootang01. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_finetuned_hkdse_english_paper4_en_5.1.1_3.0_1694647633988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_finetuned_hkdse_english_paper4_en_5.1.1_3.0_1694647633988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_finetuned_hkdse_english_paper4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_finetuned_hkdse_english_paper4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_finetuned_hkdse_english_paper4| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Wootang01/bert-large-cased-finetuned-hkdse-english-paper4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_portuguese_lenerbr_vittorio_girardi_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_portuguese_lenerbr_vittorio_girardi_en.md new file mode 100644 index 00000000000000..2d00656cbff7e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_portuguese_lenerbr_vittorio_girardi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_portuguese_lenerbr_vittorio_girardi BertEmbeddings from vittorio-girardi +author: John Snow Labs +name: bert_large_cased_portuguese_lenerbr_vittorio_girardi +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_portuguese_lenerbr_vittorio_girardi` is an English model originally trained by vittorio-girardi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_portuguese_lenerbr_vittorio_girardi_en_5.1.1_3.0_1694635205835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_portuguese_lenerbr_vittorio_girardi_en_5.1.1_3.0_1694635205835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_portuguese_lenerbr_vittorio_girardi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_portuguese_lenerbr_vittorio_girardi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_portuguese_lenerbr_vittorio_girardi| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/vittorio-girardi/bert-large-cased-pt-lenerbr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0_en.md new file mode 100644 index 00000000000000..7c833d1b9264cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0_en_5.1.1_3.0_1694645453821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0_en_5.1.1_3.0_1694645453821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_0| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1_en.md new file mode 100644 index 00000000000000..6fda8b16781464 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1_en_5.1.1_3.0_1694646476945.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1_en_5.1.1_3.0_1694646476945.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24_en.md new file mode 100644 index 00000000000000..28a34cb128ea8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24_en_5.1.1_3.0_1694570634995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24_en_5.1.1_3.0_1694570634995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_fast_24| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-fast-24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18_en.md new file mode 100644 index 00000000000000..32daae69f07e18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18_en_5.1.1_3.0_1694575920293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18_en_5.1.1_3.0_1694575920293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_18| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR50-8-fast-18 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5_en.md new file mode 100644 index 00000000000000..b555a847265302 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5_en_5.1.1_3.0_1694571831003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5_en_5.1.1_3.0_1694571831003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_5| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR50-8-fast-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7_en.md new file mode 100644 index 00000000000000..ef92da68f18d56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7_en_5.1.1_3.0_1694572454796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7_en_5.1.1_3.0_1694572454796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_7| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR50-8-fast-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_whole_word_masking_finetuned_bert_mlm6_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_whole_word_masking_finetuned_bert_mlm6_en.md new file mode 100644 index 00000000000000..50264b05b27605 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_cased_whole_word_masking_finetuned_bert_mlm6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_whole_word_masking_finetuned_bert_mlm6 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_large_cased_whole_word_masking_finetuned_bert_mlm6 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_whole_word_masking_finetuned_bert_mlm6` is an English model originally trained by himanimaheshwari3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_whole_word_masking_finetuned_bert_mlm6_en_5.1.1_3.0_1694643215167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_whole_word_masking_finetuned_bert_mlm6_en_5.1.1_3.0_1694643215167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_whole_word_masking_finetuned_bert_mlm6","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_whole_word_masking_finetuned_bert_mlm6", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_whole_word_masking_finetuned_bert_mlm6| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/himanimaheshwari3/bert-large-cased-whole-word-masking-finetuned-BERT-mlm6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_finetuned_bert_mlm5_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_finetuned_bert_mlm5_en.md new file mode 100644 index 00000000000000..0dd46dc6a2b92e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_finetuned_bert_mlm5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_uncased_finetuned_bert_mlm5 BertEmbeddings from himanimaheshwari3 +author: John Snow Labs +name: bert_large_uncased_finetuned_bert_mlm5 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_finetuned_bert_mlm5` is an English model originally trained by himanimaheshwari3. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_bert_mlm5_en_5.1.1_3.0_1694642005687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_bert_mlm5_en_5.1.1_3.0_1694642005687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_uncased_finetuned_bert_mlm5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_uncased_finetuned_bert_mlm5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_finetuned_bert_mlm5|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-large-uncased-finetuned-bert-mlm5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_sparse_80_1x4_block_pruneofa_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_sparse_80_1x4_block_pruneofa_en.md
new file mode 100644
index 00000000000000..ed30eb4cde1dd6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_sparse_80_1x4_block_pruneofa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_uncased_sparse_80_1x4_block_pruneofa BertEmbeddings from Intel
+author: John Snow Labs
+name: bert_large_uncased_sparse_80_1x4_block_pruneofa
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_sparse_80_1x4_block_pruneofa` is an English model originally trained by Intel.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_sparse_80_1x4_block_pruneofa_en_5.1.1_3.0_1694621561165.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_sparse_80_1x4_block_pruneofa_en_5.1.1_3.0_1694621561165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_large_uncased_sparse_80_1x4_block_pruneofa", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_uncased_sparse_80_1x4_block_pruneofa", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
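Review note: the Download link and the S3 URI in each card point at the same archive, and the archive name appears to follow the pattern `<name>_<lang>_<spark_nlp_version>_<spark_version>_<numeric suffix>.zip`. The sketch below, which assumes that pattern holds, derives the S3 URI from the HTTPS link and splits the archive name into its fields. `to_s3_uri` and `parse_archive_name` are hypothetical helpers, and the meaning of the trailing numeric suffix is not stated in the cards, so it is kept as an opaque string.

```python
from typing import Dict
from urllib.parse import urlparse

HTTPS_PREFIX = "https://s3.amazonaws.com/"


def to_s3_uri(download_url: str) -> str:
    """Rewrite the path-style HTTPS link as the matching s3:// URI."""
    if not download_url.startswith(HTTPS_PREFIX):
        raise ValueError(f"unexpected download URL: {download_url}")
    return "s3://" + download_url[len(HTTPS_PREFIX):]


def parse_archive_name(download_url: str) -> Dict[str, str]:
    """Split '<name>_<lang>_<spark_nlp>_<spark>_<suffix>.zip' into its fields."""
    file_name = urlparse(download_url).path.rsplit("/", 1)[-1]
    stem = file_name[:-len(".zip")] if file_name.endswith(".zip") else file_name
    # The model name itself contains underscores, so split from the right.
    name, lang, spark_nlp, spark, suffix = stem.rsplit("_", 4)
    return {
        "name": name,
        "lang": lang,
        "spark_nlp": spark_nlp,
        "spark": spark,
        "suffix": suffix,
    }


url = (
    "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/"
    "bert_large_uncased_sparse_80_1x4_block_pruneofa_en_5.1.1_3.0_1694621561165.zip"
)
print(to_s3_uri(url))
print(parse_archive_name(url))
```

Comparing the parsed `name` and `lang` with the front matter (`name:` and `language:`) and with the arguments to `pretrained(...)` is one way to catch a card whose links and snippet have drifted apart.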
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_sparse_80_1x4_block_pruneofa|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|494.9 MB|
+
+## References
+
+https://huggingface.co/Intel/bert-large-uncased-sparse-80-1x4-block-pruneofa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_whole_word_masking_finetuned_bert_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_whole_word_masking_finetuned_bert_mlm_en.md
new file mode 100644
index 00000000000000..cd6739e3038b81
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_large_uncased_whole_word_masking_finetuned_bert_mlm_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_uncased_whole_word_masking_finetuned_bert_mlm BertEmbeddings from himanimaheshwari3
+author: John Snow Labs
+name: bert_large_uncased_whole_word_masking_finetuned_bert_mlm
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_whole_word_masking_finetuned_bert_mlm` is an English model originally trained by himanimaheshwari3.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_bert_mlm_en_5.1.1_3.0_1694641067297.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_bert_mlm_en_5.1.1_3.0_1694641067297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_large_uncased_whole_word_masking_finetuned_bert_mlm", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_uncased_whole_word_masking_finetuned_bert_mlm", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_whole_word_masking_finetuned_bert_mlm|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/himanimaheshwari3/bert-large-uncased-whole-word-masking-finetuned-bert-mlm
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_medium_arapoembert_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_medium_arapoembert_en.md
new file mode 100644
index 00000000000000..0a003e2ed46b26
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_medium_arapoembert_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_medium_arapoembert BertEmbeddings from faisalq
+author: John Snow Labs
+name: bert_medium_arapoembert
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_medium_arapoembert` is an English model originally trained by faisalq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_medium_arapoembert_en_5.1.1_3.0_1694614622989.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_medium_arapoembert_en_5.1.1_3.0_1694614622989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_medium_arapoembert", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_medium_arapoembert", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_medium_arapoembert|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|356.2 MB|
+
+## References
+
+https://huggingface.co/faisalq/bert-medium-arapoembert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_multilingial_geolocation_prediction_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_multilingial_geolocation_prediction_en.md
new file mode 100644
index 00000000000000..6a60d1b50a7c64
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_multilingial_geolocation_prediction_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_multilingial_geolocation_prediction BertEmbeddings from k4tel
+author: John Snow Labs
+name: bert_multilingial_geolocation_prediction
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilingial_geolocation_prediction` is an English model originally trained by k4tel.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingial_geolocation_prediction_en_5.1.1_3.0_1694601831218.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingial_geolocation_prediction_en_5.1.1_3.0_1694601831218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_multilingial_geolocation_prediction", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_multilingial_geolocation_prediction", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_multilingial_geolocation_prediction|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|663.2 MB|
+
+## References
+
+https://huggingface.co/k4tel/bert-multilingial-geolocation-prediction
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_pretrain_onlydj96_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_pretrain_onlydj96_en.md
new file mode 100644
index 00000000000000..71165441c55e35
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_pretrain_onlydj96_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_pretrain_onlydj96 BertEmbeddings from onlydj96
+author: John Snow Labs
+name: bert_pretrain_onlydj96
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_pretrain_onlydj96` is an English model originally trained by onlydj96.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_pretrain_onlydj96_en_5.1.1_3.0_1694629039142.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_pretrain_onlydj96_en_5.1.1_3.0_1694629039142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_pretrain_onlydj96", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_pretrain_onlydj96", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_pretrain_onlydj96|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|412.6 MB|
+
+## References
+
+https://huggingface.co/onlydj96/bert_pretrain
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_pretraining_gaudi_2_batch_size_32_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_pretraining_gaudi_2_batch_size_32_en.md
new file mode 100644
index 00000000000000..8936eb5fcbbbe8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_pretraining_gaudi_2_batch_size_32_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_pretraining_gaudi_2_batch_size_32 BertEmbeddings from regisss
+author: John Snow Labs
+name: bert_pretraining_gaudi_2_batch_size_32
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_pretraining_gaudi_2_batch_size_32` is an English model originally trained by regisss.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_pretraining_gaudi_2_batch_size_32_en_5.1.1_3.0_1694646591739.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_pretraining_gaudi_2_batch_size_32_en_5.1.1_3.0_1694646591739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_pretraining_gaudi_2_batch_size_32", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_pretraining_gaudi_2_batch_size_32", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_pretraining_gaudi_2_batch_size_32|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|412.3 MB|
+
+## References
+
+https://huggingface.co/regisss/bert-pretraining-gaudi-2-batch-size-32
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_pretraining_gaudi_2_batch_size_64_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_pretraining_gaudi_2_batch_size_64_en.md
new file mode 100644
index 00000000000000..ad2d5da98c2987
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_pretraining_gaudi_2_batch_size_64_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_pretraining_gaudi_2_batch_size_64 BertEmbeddings from regisss
+author: John Snow Labs
+name: bert_pretraining_gaudi_2_batch_size_64
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_pretraining_gaudi_2_batch_size_64` is an English model originally trained by regisss.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_pretraining_gaudi_2_batch_size_64_en_5.1.1_3.0_1694648529845.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_pretraining_gaudi_2_batch_size_64_en_5.1.1_3.0_1694648529845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_pretraining_gaudi_2_batch_size_64", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_pretraining_gaudi_2_batch_size_64", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_pretraining_gaudi_2_batch_size_64|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|412.3 MB|
+
+## References
+
+https://huggingface.co/regisss/bert-pretraining-gaudi-2-batch-size-64
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_small_finetuned_legal_contracts_larger4010_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_small_finetuned_legal_contracts_larger4010_en.md
new file mode 100644
index 00000000000000..dfd10511e03d99
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_small_finetuned_legal_contracts_larger4010_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_small_finetuned_legal_contracts_larger4010 BertEmbeddings from muhtasham
+author: John Snow Labs
+name: bert_small_finetuned_legal_contracts_larger4010
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_legal_contracts_larger4010` is an English model originally trained by muhtasham.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_legal_contracts_larger4010_en_5.1.1_3.0_1694572295418.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_legal_contracts_larger4010_en_5.1.1_3.0_1694572295418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_small_finetuned_legal_contracts_larger4010", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_small_finetuned_legal_contracts_larger4010", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_small_finetuned_legal_contracts_larger4010|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|106.9 MB|
+
+## References
+
+https://huggingface.co/muhtasham/bert-small-finetuned-legal-contracts-larger4010
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_small_finetuned_parsed_longer50_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_small_finetuned_parsed_longer50_en.md
new file mode 100644
index 00000000000000..5d895c836551c5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_small_finetuned_parsed_longer50_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_small_finetuned_parsed_longer50 BertEmbeddings from muhtasham
+author: John Snow Labs
+name: bert_small_finetuned_parsed_longer50
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_parsed_longer50` is an English model originally trained by muhtasham.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_parsed_longer50_en_5.1.1_3.0_1694573838567.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_parsed_longer50_en_5.1.1_3.0_1694573838567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_small_finetuned_parsed_longer50", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_small_finetuned_parsed_longer50", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_small_finetuned_parsed_longer50|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|106.9 MB|
+
+## References
+
+https://huggingface.co/muhtasham/bert-small-finetuned-parsed-longer50
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_tiny_historic_multilingual_cased_xx.md b/docs/_posts/ahmedlone127/2023-09-13-bert_tiny_historic_multilingual_cased_xx.md
new file mode 100644
index 00000000000000..4028b6eb87f7bb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_tiny_historic_multilingual_cased_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual bert_tiny_historic_multilingual_cased BertEmbeddings from dbmdz
+author: John Snow Labs
+name: bert_tiny_historic_multilingual_cased
+date: 2023-09-13
+tags: [bert, xx, open_source, fill_mask, onnx]
+task: Embeddings
+language: xx
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_historic_multilingual_cased` is a Multilingual model originally trained by dbmdz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_historic_multilingual_cased_xx_5.1.1_3.0_1694599550183.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_historic_multilingual_cased_xx_5.1.1_3.0_1694599550183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_tiny_historic_multilingual_cased", "xx") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_tiny_historic_multilingual_cased", "xx")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_tiny_historic_multilingual_cased|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|xx|
+|Size:|17.4 MB|
+
+## References
+
+https://huggingface.co/dbmdz/bert-tiny-historic-multilingual-cased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bert_twitter_hashtag_en.md b/docs/_posts/ahmedlone127/2023-09-13-bert_twitter_hashtag_en.md
new file mode 100644
index 00000000000000..8d629b57d1fe87
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bert_twitter_hashtag_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_twitter_hashtag BertEmbeddings from vivianhuang88
+author: John Snow Labs
+name: bert_twitter_hashtag
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_twitter_hashtag` is an English model originally trained by vivianhuang88.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitter_hashtag_en_5.1.1_3.0_1694639330517.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitter_hashtag_en_5.1.1_3.0_1694639330517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_twitter_hashtag", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_twitter_hashtag", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_twitter_hashtag|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|410.0 MB|
+
+## References
+
+https://huggingface.co/vivianhuang88/bert_twitter_hashtag
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-bertcl_en.md b/docs/_posts/ahmedlone127/2023-09-13-bertcl_en.md
new file mode 100644
index 00000000000000..2a70f8c56413b6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-bertcl_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bertcl BertEmbeddings from georgepu1
+author: John Snow Labs
+name: bertcl
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertcl` is an English model originally trained by georgepu1.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertcl_en_5.1.1_3.0_1694622288755.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertcl_en_5.1.1_3.0_1694622288755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bertcl","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bertcl", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertcl| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/georgepu1/bertcl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bertinho_galician_base_cased_gl.md b/docs/_posts/ahmedlone127/2023-09-13-bertinho_galician_base_cased_gl.md new file mode 100644 index 00000000000000..2d88a958aad3c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bertinho_galician_base_cased_gl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Galician bertinho_galician_base_cased BertEmbeddings from dvilares +author: John Snow Labs +name: bertinho_galician_base_cased +date: 2023-09-13 +tags: [bert, gl, open_source, fill_mask, onnx] +task: Embeddings +language: gl +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertinho_galician_base_cased` is a Galician model originally trained by dvilares. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertinho_galician_base_cased_gl_5.1.1_3.0_1694628723686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertinho_galician_base_cased_gl_5.1.1_3.0_1694628723686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bertinho_galician_base_cased","gl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bertinho_galician_base_cased", "gl") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertinho_galician_base_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|gl| +|Size:|405.3 MB| + +## References + +https://huggingface.co/dvilares/bertinho-gl-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bertinho_galician_small_cased_gl.md b/docs/_posts/ahmedlone127/2023-09-13-bertinho_galician_small_cased_gl.md new file mode 100644 index 00000000000000..db394cb9c2da90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bertinho_galician_small_cased_gl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Galician bertinho_galician_small_cased BertEmbeddings from dvilares +author: John Snow Labs +name: bertinho_galician_small_cased +date: 2023-09-13 +tags: [bert, gl, open_source, fill_mask, onnx] +task: Embeddings +language: gl +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertinho_galician_small_cased` is a Galician model originally trained by dvilares. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertinho_galician_small_cased_gl_5.1.1_3.0_1694629051948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertinho_galician_small_cased_gl_5.1.1_3.0_1694629051948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bertinho_galician_small_cased","gl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bertinho_galician_small_cased", "gl") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertinho_galician_small_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|gl| +|Size:|245.8 MB| + +## References + +https://huggingface.co/dvilares/bertinho-gl-small-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bertu_mt.md b/docs/_posts/ahmedlone127/2023-09-13-bertu_mt.md new file mode 100644 index 00000000000000..31857f561ec391 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bertu_mt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Maltese bertu BertEmbeddings from MLRS +author: John Snow Labs +name: bertu +date: 2023-09-13 +tags: [bert, mt, open_source, fill_mask, onnx] +task: Embeddings +language: mt +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertu` is a Maltese model originally trained by MLRS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertu_mt_5.1.1_3.0_1694635154194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertu_mt_5.1.1_3.0_1694635154194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bertu","mt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bertu", "mt") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertu| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|mt| +|Size:|468.7 MB| + +## References + +https://huggingface.co/MLRS/BERTu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-beto_base_cased_en.md b/docs/_posts/ahmedlone127/2023-09-13-beto_base_cased_en.md new file mode 100644 index 00000000000000..f1be7bdb9a449d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-beto_base_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English beto_base_cased BertEmbeddings from espejelomar +author: John Snow Labs +name: beto_base_cased +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_base_cased` is an English model originally trained by espejelomar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_base_cased_en_5.1.1_3.0_1694634951862.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_base_cased_en_5.1.1_3.0_1694634951862.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("beto_base_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("beto_base_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_base_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/espejelomar/beto-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-betonews_bodycontext_en.md b/docs/_posts/ahmedlone127/2023-09-13-betonews_bodycontext_en.md new file mode 100644 index 00000000000000..0cde8eb199aa3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-betonews_bodycontext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English betonews_bodycontext BertEmbeddings from finiteautomata +author: John Snow Labs +name: betonews_bodycontext +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `betonews_bodycontext` is an English model originally trained by finiteautomata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/betonews_bodycontext_en_5.1.1_3.0_1694638060106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/betonews_bodycontext_en_5.1.1_3.0_1694638060106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("betonews_bodycontext","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("betonews_bodycontext", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|betonews_bodycontext| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/finiteautomata/betonews-bodycontext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-betonews_nonecontext_en.md b/docs/_posts/ahmedlone127/2023-09-13-betonews_nonecontext_en.md new file mode 100644 index 00000000000000..20047403ab7a2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-betonews_nonecontext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English betonews_nonecontext BertEmbeddings from finiteautomata +author: John Snow Labs +name: betonews_nonecontext +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `betonews_nonecontext` is an English model originally trained by finiteautomata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/betonews_nonecontext_en_5.1.1_3.0_1694638603674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/betonews_nonecontext_en_5.1.1_3.0_1694638603674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("betonews_nonecontext","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("betonews_nonecontext", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|betonews_nonecontext| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.0 MB| + +## References + +https://huggingface.co/finiteautomata/betonews-nonecontext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-betonews_tweetcontext_en.md b/docs/_posts/ahmedlone127/2023-09-13-betonews_tweetcontext_en.md new file mode 100644 index 00000000000000..62a20d9dfe8eb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-betonews_tweetcontext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English betonews_tweetcontext BertEmbeddings from piuba-bigdata +author: John Snow Labs +name: betonews_tweetcontext +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `betonews_tweetcontext` is an English model originally trained by piuba-bigdata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/betonews_tweetcontext_en_5.1.1_3.0_1694639163920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/betonews_tweetcontext_en_5.1.1_3.0_1694639163920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("betonews_tweetcontext","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("betonews_tweetcontext", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|betonews_tweetcontext| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/piuba-bigdata/betonews-tweetcontext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bio_clinicalbert_emilyalsentzer_en.md b/docs/_posts/ahmedlone127/2023-09-13-bio_clinicalbert_emilyalsentzer_en.md new file mode 100644 index 00000000000000..3de9706bf463b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bio_clinicalbert_emilyalsentzer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bio_clinicalbert_emilyalsentzer BertEmbeddings from emilyalsentzer +author: John Snow Labs +name: bio_clinicalbert_emilyalsentzer +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bio_clinicalbert_emilyalsentzer` is an English model originally trained by emilyalsentzer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_emilyalsentzer_en_5.1.1_3.0_1694631294413.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_emilyalsentzer_en_5.1.1_3.0_1694631294413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bio_clinicalbert_emilyalsentzer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bio_clinicalbert_emilyalsentzer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bio_clinicalbert_emilyalsentzer| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.3 MB| + +## References + +https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bio_clinicalbert_surgicalcardiothoracic_en.md b/docs/_posts/ahmedlone127/2023-09-13-bio_clinicalbert_surgicalcardiothoracic_en.md new file mode 100644 index 00000000000000..99e04d422764e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bio_clinicalbert_surgicalcardiothoracic_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bio_clinicalbert_surgicalcardiothoracic BertEmbeddings from Gaborandi +author: John Snow Labs +name: bio_clinicalbert_surgicalcardiothoracic +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bio_clinicalbert_surgicalcardiothoracic` is an English model originally trained by Gaborandi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_surgicalcardiothoracic_en_5.1.1_3.0_1694620142173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_surgicalcardiothoracic_en_5.1.1_3.0_1694620142173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bio_clinicalbert_surgicalcardiothoracic","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bio_clinicalbert_surgicalcardiothoracic", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bio_clinicalbert_surgicalcardiothoracic| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.6 MB| + +## References + +https://huggingface.co/Gaborandi/Bio_ClinicalBERT-SurgicalCardiothoracic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bio_discharge_summary_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-bio_discharge_summary_bert_en.md new file mode 100644 index 00000000000000..8a984e5c591181 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bio_discharge_summary_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bio_discharge_summary_bert BertEmbeddings from emilyalsentzer +author: John Snow Labs +name: bio_discharge_summary_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bio_discharge_summary_bert` is an English model originally trained by emilyalsentzer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bio_discharge_summary_bert_en_5.1.1_3.0_1694631653208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bio_discharge_summary_bert_en_5.1.1_3.0_1694631653208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bio_discharge_summary_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bio_discharge_summary_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bio_discharge_summary_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.3 MB| + +## References + +https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-biobert_base_cased_v1.2_en.md b/docs/_posts/ahmedlone127/2023-09-13-biobert_base_cased_v1.2_en.md new file mode 100644 index 00000000000000..483e34d57ef733 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-biobert_base_cased_v1.2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_base_cased_v1.2 BertEmbeddings from dmis-lab +author: John Snow Labs +name: biobert_base_cased_v1.2 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_base_cased_v1.2` is an English model originally trained by dmis-lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1.2_en_5.1.1_3.0_1694625923823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1.2_en_5.1.1_3.0_1694625923823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("biobert_base_cased_v1.2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("biobert_base_cased_v1.2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_base_cased_v1.2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/dmis-lab/biobert-base-cased-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-biobert_base_cased_v1.2_finetuned_smpc_en.md b/docs/_posts/ahmedlone127/2023-09-13-biobert_base_cased_v1.2_finetuned_smpc_en.md new file mode 100644 index 00000000000000..c46a71da294d18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-biobert_base_cased_v1.2_finetuned_smpc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_base_cased_v1.2_finetuned_smpc BertEmbeddings from sophy +author: John Snow Labs +name: biobert_base_cased_v1.2_finetuned_smpc +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_base_cased_v1.2_finetuned_smpc` is an English model originally trained by sophy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1.2_finetuned_smpc_en_5.1.1_3.0_1694635769664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1.2_finetuned_smpc_en_5.1.1_3.0_1694635769664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("biobert_base_cased_v1.2_finetuned_smpc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("biobert_base_cased_v1.2_finetuned_smpc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_base_cased_v1.2_finetuned_smpc| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/sophy/biobert-base-cased-v1.2-finetuned-smpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-biobert_giotto_en.md b/docs/_posts/ahmedlone127/2023-09-13-biobert_giotto_en.md new file mode 100644 index 00000000000000..797b623dfe1836 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-biobert_giotto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_giotto BertEmbeddings from dpalominop +author: John Snow Labs +name: biobert_giotto +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_giotto` is an English model originally trained by dpalominop. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_giotto_en_5.1.1_3.0_1694626974873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_giotto_en_5.1.1_3.0_1694626974873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("biobert_giotto","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("biobert_giotto", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_giotto| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.4 MB| + +## References + +https://huggingface.co/dpalominop/biobert-giotto \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-bioformer_litcovid_en.md b/docs/_posts/ahmedlone127/2023-09-13-bioformer_litcovid_en.md new file mode 100644 index 00000000000000..612f8da6ab8354 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-bioformer_litcovid_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bioformer_litcovid BertEmbeddings from bioformers +author: John Snow Labs +name: bioformer_litcovid +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer_litcovid` is an English model originally trained by bioformers. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bioformer_litcovid_en_5.1.1_3.0_1694626519703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bioformer_litcovid_en_5.1.1_3.0_1694626519703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bioformer_litcovid","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bioformer_litcovid", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bioformer_litcovid| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|158.5 MB| + +## References + +https://huggingface.co/bioformers/bioformer-litcovid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-blade_english_chinese_en.md b/docs/_posts/ahmedlone127/2023-09-13-blade_english_chinese_en.md new file mode 100644 index 00000000000000..a5ca8f67175913 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-blade_english_chinese_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English blade_english_chinese BertEmbeddings from srnair +author: John Snow Labs +name: blade_english_chinese +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `blade_english_chinese` is an English model originally trained by srnair. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/blade_english_chinese_en_5.1.1_3.0_1694615597365.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/blade_english_chinese_en_5.1.1_3.0_1694615597365.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("blade_english_chinese","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("blade_english_chinese", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|blade_english_chinese| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|421.7 MB| + +## References + +https://huggingface.co/srnair/blade-en-zh \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-blade_english_russian_en.md b/docs/_posts/ahmedlone127/2023-09-13-blade_english_russian_en.md new file mode 100644 index 00000000000000..479b5c207277dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-blade_english_russian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English blade_english_russian BertEmbeddings from srnair +author: John Snow Labs +name: blade_english_russian +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `blade_english_russian` is an English model originally trained by srnair. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/blade_english_russian_en_5.1.1_3.0_1694620469182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/blade_english_russian_en_5.1.1_3.0_1694620469182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("blade_english_russian","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("blade_english_russian", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|blade_english_russian| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|428.2 MB| + +## References + +https://huggingface.co/srnair/blade-en-ru \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-burmese_awesome_mlm_model_en.md b/docs/_posts/ahmedlone127/2023-09-13-burmese_awesome_mlm_model_en.md new file mode 100644 index 00000000000000..dd73175446486a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-burmese_awesome_mlm_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_mlm_model BertEmbeddings from wajdii +author: John Snow Labs +name: burmese_awesome_mlm_model +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_mlm_model` is an English model originally trained by wajdii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_mlm_model_en_5.1.1_3.0_1694618818921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_mlm_model_en_5.1.1_3.0_1694618818921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("burmese_awesome_mlm_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("burmese_awesome_mlm_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_mlm_model| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/wajdii/my_awesome_mlm_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-carlbert_webex_mlm_vignesh95_en.md b/docs/_posts/ahmedlone127/2023-09-13-carlbert_webex_mlm_vignesh95_en.md new file mode 100644 index 00000000000000..a4f09ad74f6873 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-carlbert_webex_mlm_vignesh95_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English carlbert_webex_mlm_vignesh95 BertEmbeddings from Vignesh95 +author: John Snow Labs +name: carlbert_webex_mlm_vignesh95 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `carlbert_webex_mlm_vignesh95` is an English model originally trained by Vignesh95. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/carlbert_webex_mlm_vignesh95_en_5.1.1_3.0_1694616371291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/carlbert_webex_mlm_vignesh95_en_5.1.1_3.0_1694616371291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("carlbert_webex_mlm_vignesh95","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("carlbert_webex_mlm_vignesh95", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|carlbert_webex_mlm_vignesh95| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Vignesh95/carlbert-webex-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-carlbert_webex_mlm_wolof_recipient_en.md b/docs/_posts/ahmedlone127/2023-09-13-carlbert_webex_mlm_wolof_recipient_en.md new file mode 100644 index 00000000000000..68710b1f4dc39a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-carlbert_webex_mlm_wolof_recipient_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English carlbert_webex_mlm_wolof_recipient BertEmbeddings from Vignesh95 +author: John Snow Labs +name: carlbert_webex_mlm_wolof_recipient +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `carlbert_webex_mlm_wolof_recipient` is an English model originally trained by Vignesh95. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/carlbert_webex_mlm_wolof_recipient_en_5.1.1_3.0_1694616716297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/carlbert_webex_mlm_wolof_recipient_en_5.1.1_3.0_1694616716297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("carlbert_webex_mlm_wolof_recipient","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("carlbert_webex_mlm_wolof_recipient", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|carlbert_webex_mlm_wolof_recipient| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Vignesh95/carlbert-webex-mlm-wo-recipient \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cdgp_chilean_sign_language_bert_cloth_en.md b/docs/_posts/ahmedlone127/2023-09-13-cdgp_chilean_sign_language_bert_cloth_en.md new file mode 100644 index 00000000000000..1448868f11e46b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cdgp_chilean_sign_language_bert_cloth_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cdgp_chilean_sign_language_bert_cloth BertEmbeddings from AndyChiang +author: John Snow Labs +name: cdgp_chilean_sign_language_bert_cloth +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cdgp_chilean_sign_language_bert_cloth` is an English model originally trained by AndyChiang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cdgp_chilean_sign_language_bert_cloth_en_5.1.1_3.0_1694592903470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cdgp_chilean_sign_language_bert_cloth_en_5.1.1_3.0_1694592903470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cdgp_chilean_sign_language_bert_cloth","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cdgp_chilean_sign_language_bert_cloth", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cdgp_chilean_sign_language_bert_cloth| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/AndyChiang/cdgp-csg-bert-cloth \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cdgp_chilean_sign_language_bert_dgen_en.md b/docs/_posts/ahmedlone127/2023-09-13-cdgp_chilean_sign_language_bert_dgen_en.md new file mode 100644 index 00000000000000..ce5acdfbd83008 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cdgp_chilean_sign_language_bert_dgen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cdgp_chilean_sign_language_bert_dgen BertEmbeddings from AndyChiang +author: John Snow Labs +name: cdgp_chilean_sign_language_bert_dgen +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cdgp_chilean_sign_language_bert_dgen` is an English model originally trained by AndyChiang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cdgp_chilean_sign_language_bert_dgen_en_5.1.1_3.0_1694593050555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cdgp_chilean_sign_language_bert_dgen_en_5.1.1_3.0_1694593050555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cdgp_chilean_sign_language_bert_dgen","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cdgp_chilean_sign_language_bert_dgen", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cdgp_chilean_sign_language_bert_dgen| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/AndyChiang/cdgp-csg-bert-dgen \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_c1_cust_en.md b/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_c1_cust_en.md new file mode 100644 index 00000000000000..11148203990f4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_c1_cust_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chemical_bert_uncased_finetuned_cust_c1_cust BertEmbeddings from shafin +author: John Snow Labs +name: chemical_bert_uncased_finetuned_cust_c1_cust +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chemical_bert_uncased_finetuned_cust_c1_cust` is an English model originally trained by shafin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_finetuned_cust_c1_cust_en_5.1.1_3.0_1694630083309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_finetuned_cust_c1_cust_en_5.1.1_3.0_1694630083309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("chemical_bert_uncased_finetuned_cust_c1_cust","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("chemical_bert_uncased_finetuned_cust_c1_cust", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chemical_bert_uncased_finetuned_cust_c1_cust| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/shafin/chemical-bert-uncased-finetuned-cust-c1-cust \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_c2_en.md b/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_c2_en.md new file mode 100644 index 00000000000000..290f7b4d47d88f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_c2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chemical_bert_uncased_finetuned_cust_c2 BertEmbeddings from shafin +author: John Snow Labs +name: chemical_bert_uncased_finetuned_cust_c2 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chemical_bert_uncased_finetuned_cust_c2` is an English model originally trained by shafin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_finetuned_cust_c2_en_5.1.1_3.0_1694633093400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_finetuned_cust_c2_en_5.1.1_3.0_1694633093400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("chemical_bert_uncased_finetuned_cust_c2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("chemical_bert_uncased_finetuned_cust_c2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chemical_bert_uncased_finetuned_cust_c2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/shafin/chemical-bert-uncased-finetuned-cust-c2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_en.md b/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_en.md new file mode 100644 index 00000000000000..8e12c86e717ffa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-chemical_bert_uncased_finetuned_cust_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chemical_bert_uncased_finetuned_cust BertEmbeddings from shafin +author: John Snow Labs +name: chemical_bert_uncased_finetuned_cust +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chemical_bert_uncased_finetuned_cust` is an English model originally trained by shafin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_finetuned_cust_en_5.1.1_3.0_1694625924079.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_finetuned_cust_en_5.1.1_3.0_1694625924079.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("chemical_bert_uncased_finetuned_cust","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("chemical_bert_uncased_finetuned_cust", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chemical_bert_uncased_finetuned_cust| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/shafin/chemical-bert-uncased-finetuned-cust \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-childes_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-childes_bert_en.md new file mode 100644 index 00000000000000..931e9114a64ad2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-childes_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English childes_bert BertEmbeddings from smeylan +author: John Snow Labs +name: childes_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `childes_bert` is an English model originally trained by smeylan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/childes_bert_en_5.1.1_3.0_1694574765875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/childes_bert_en_5.1.1_3.0_1694574765875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("childes_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("childes_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|childes_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/smeylan/childes-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cl_arabertv0.1_base_en.md b/docs/_posts/ahmedlone127/2023-09-13-cl_arabertv0.1_base_en.md new file mode 100644 index 00000000000000..354e265dde3e2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cl_arabertv0.1_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cl_arabertv0.1_base BertEmbeddings from qahq +author: John Snow Labs +name: cl_arabertv0.1_base +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cl_arabertv0.1_base` is an English model originally trained by qahq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cl_arabertv0.1_base_en_5.1.1_3.0_1694611150025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cl_arabertv0.1_base_en_5.1.1_3.0_1694611150025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cl_arabertv0.1_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cl_arabertv0.1_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cl_arabertv0.1_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|505.0 MB| + +## References + +https://huggingface.co/qahq/CL-AraBERTv0.1-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-clinical_bert_base_128_en.md b/docs/_posts/ahmedlone127/2023-09-13-clinical_bert_base_128_en.md new file mode 100644 index 00000000000000..f18b61939be216 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-clinical_bert_base_128_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clinical_bert_base_128 BertEmbeddings from Tsubasaz +author: John Snow Labs +name: clinical_bert_base_128 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinical_bert_base_128` is an English model originally trained by Tsubasaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinical_bert_base_128_en_5.1.1_3.0_1694572855798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinical_bert_base_128_en_5.1.1_3.0_1694572855798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("clinical_bert_base_128","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("clinical_bert_base_128", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinical_bert_base_128| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/Tsubasaz/clinical-bert-base-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cocodr_base_msmarco_en.md b/docs/_posts/ahmedlone127/2023-09-13-cocodr_base_msmarco_en.md new file mode 100644 index 00000000000000..8228637f0d980e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cocodr_base_msmarco_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cocodr_base_msmarco BertEmbeddings from OpenMatch +author: John Snow Labs +name: cocodr_base_msmarco +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cocodr_base_msmarco` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cocodr_base_msmarco_en_5.1.1_3.0_1694609582266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cocodr_base_msmarco_en_5.1.1_3.0_1694609582266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cocodr_base_msmarco","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cocodr_base_msmarco", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cocodr_base_msmarco| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/OpenMatch/cocodr-base-msmarco \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cocodr_base_msmarco_warmup_en.md b/docs/_posts/ahmedlone127/2023-09-13-cocodr_base_msmarco_warmup_en.md new file mode 100644 index 00000000000000..160749a93944da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cocodr_base_msmarco_warmup_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cocodr_base_msmarco_warmup BertEmbeddings from OpenMatch +author: John Snow Labs +name: cocodr_base_msmarco_warmup +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cocodr_base_msmarco_warmup` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cocodr_base_msmarco_warmup_en_5.1.1_3.0_1694643215149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cocodr_base_msmarco_warmup_en_5.1.1_3.0_1694643215149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cocodr_base_msmarco_warmup","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cocodr_base_msmarco_warmup", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cocodr_base_msmarco_warmup| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/OpenMatch/cocodr-base-msmarco-warmup \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_en.md b/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_en.md new file mode 100644 index 00000000000000..c52f7bd32f4645 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cocodr_large_msmarco BertEmbeddings from OpenMatch +author: John Snow Labs +name: cocodr_large_msmarco +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cocodr_large_msmarco` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cocodr_large_msmarco_en_5.1.1_3.0_1694610587271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cocodr_large_msmarco_en_5.1.1_3.0_1694610587271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cocodr_large_msmarco","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cocodr_large_msmarco", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cocodr_large_msmarco| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/OpenMatch/cocodr-large-msmarco \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_idro_only_en.md b/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_idro_only_en.md new file mode 100644 index 00000000000000..02dc5ddfc57ef7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_idro_only_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cocodr_large_msmarco_idro_only BertEmbeddings from OpenMatch +author: John Snow Labs +name: cocodr_large_msmarco_idro_only +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cocodr_large_msmarco_idro_only` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cocodr_large_msmarco_idro_only_en_5.1.1_3.0_1694615878283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cocodr_large_msmarco_idro_only_en_5.1.1_3.0_1694615878283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cocodr_large_msmarco_idro_only","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cocodr_large_msmarco_idro_only", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cocodr_large_msmarco_idro_only| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/OpenMatch/cocodr-large-msmarco-idro-only \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_warmup_en.md b/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_warmup_en.md new file mode 100644 index 00000000000000..091ba7798f5740 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cocodr_large_msmarco_warmup_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cocodr_large_msmarco_warmup BertEmbeddings from OpenMatch +author: John Snow Labs +name: cocodr_large_msmarco_warmup +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cocodr_large_msmarco_warmup` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cocodr_large_msmarco_warmup_en_5.1.1_3.0_1694615098798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cocodr_large_msmarco_warmup_en_5.1.1_3.0_1694615098798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cocodr_large_msmarco_warmup","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cocodr_large_msmarco_warmup", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cocodr_large_msmarco_warmup| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/OpenMatch/cocodr-large-msmarco-warmup \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-colbert_bertnsp_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-colbert_bertnsp_220_en.md new file mode 100644 index 00000000000000..46d97738416db0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-colbert_bertnsp_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English colbert_bertnsp_220 BertEmbeddings from approach0 +author: John Snow Labs +name: colbert_bertnsp_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `colbert_bertnsp_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/colbert_bertnsp_220_en_5.1.1_3.0_1694632363032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/colbert_bertnsp_220_en_5.1.1_3.0_1694632363032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("colbert_bertnsp_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("colbert_bertnsp_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|colbert_bertnsp_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/colbert-bertnsp-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-colbert_bertnsp_600_en.md b/docs/_posts/ahmedlone127/2023-09-13-colbert_bertnsp_600_en.md new file mode 100644 index 00000000000000..b5553e73bcb2ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-colbert_bertnsp_600_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English colbert_bertnsp_600 BertEmbeddings from approach0 +author: John Snow Labs +name: colbert_bertnsp_600 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `colbert_bertnsp_600` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/colbert_bertnsp_600_en_5.1.1_3.0_1694632961846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/colbert_bertnsp_600_en_5.1.1_3.0_1694632961846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("colbert_bertnsp_600","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("colbert_bertnsp_600", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|colbert_bertnsp_600| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/colbert-bertnsp-600 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-colbert_cocomae_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-colbert_cocomae_220_en.md new file mode 100644 index 00000000000000..b8c5dd29fd76b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-colbert_cocomae_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English colbert_cocomae_220 BertEmbeddings from approach0 +author: John Snow Labs +name: colbert_cocomae_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `colbert_cocomae_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/colbert_cocomae_220_en_5.1.1_3.0_1694631433780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/colbert_cocomae_220_en_5.1.1_3.0_1694631433780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("colbert_cocomae_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("colbert_cocomae_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|colbert_cocomae_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/colbert-cocomae-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-colbert_cocomae_600_en.md b/docs/_posts/ahmedlone127/2023-09-13-colbert_cocomae_600_en.md new file mode 100644 index 00000000000000..efca6eb167b23c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-colbert_cocomae_600_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English colbert_cocomae_600 BertEmbeddings from approach0 +author: John Snow Labs +name: colbert_cocomae_600 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `colbert_cocomae_600` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/colbert_cocomae_600_en_5.1.1_3.0_1694631908560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/colbert_cocomae_600_en_5.1.1_3.0_1694631908560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("colbert_cocomae_600","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("colbert_cocomae_600", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|colbert_cocomae_600| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/colbert-cocomae-600 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-condenser_large_en.md b/docs/_posts/ahmedlone127/2023-09-13-condenser_large_en.md new file mode 100644 index 00000000000000..febd8c7a23d7af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-condenser_large_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English condenser_large BertEmbeddings from OpenMatch +author: John Snow Labs +name: condenser_large +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `condenser_large` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/condenser_large_en_5.1.1_3.0_1694612875253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/condenser_large_en_5.1.1_3.0_1694612875253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("condenser_large","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("condenser_large", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|condenser_large| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/OpenMatch/condenser-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cord_scibert_en.md b/docs/_posts/ahmedlone127/2023-09-13-cord_scibert_en.md new file mode 100644 index 00000000000000..fb2738000a6231 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cord_scibert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cord_scibert BertEmbeddings from athiban2001 +author: John Snow Labs +name: cord_scibert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cord_scibert` is an English model originally trained by athiban2001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cord_scibert_en_5.1.1_3.0_1694617688873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cord_scibert_en_5.1.1_3.0_1694617688873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cord_scibert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cord_scibert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cord_scibert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|412.7 MB| + +## References + +https://huggingface.co/athiban2001/cord-scibert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-corsican_condenser_large_en.md b/docs/_posts/ahmedlone127/2023-09-13-corsican_condenser_large_en.md new file mode 100644 index 00000000000000..b44932a1fd53f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-corsican_condenser_large_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English corsican_condenser_large BertEmbeddings from OpenMatch +author: John Snow Labs +name: corsican_condenser_large +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `corsican_condenser_large` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/corsican_condenser_large_en_5.1.1_3.0_1694613630133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/corsican_condenser_large_en_5.1.1_3.0_1694613630133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("corsican_condenser_large","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("corsican_condenser_large", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|corsican_condenser_large| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/OpenMatch/co-condenser-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-corsican_condenser_large_msmarco_en.md b/docs/_posts/ahmedlone127/2023-09-13-corsican_condenser_large_msmarco_en.md new file mode 100644 index 00000000000000..82cb24462acbbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-corsican_condenser_large_msmarco_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English corsican_condenser_large_msmarco BertEmbeddings from OpenMatch +author: John Snow Labs +name: corsican_condenser_large_msmarco +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `corsican_condenser_large_msmarco` is an English model originally trained by OpenMatch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/corsican_condenser_large_msmarco_en_5.1.1_3.0_1694614396061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/corsican_condenser_large_msmarco_en_5.1.1_3.0_1694614396061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("corsican_condenser_large_msmarco","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("corsican_condenser_large_msmarco", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|corsican_condenser_large_msmarco| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/OpenMatch/co-condenser-large-msmarco \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-costa_wiki_en.md b/docs/_posts/ahmedlone127/2023-09-13-costa_wiki_en.md new file mode 100644 index 00000000000000..a345b27c2093bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-costa_wiki_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English costa_wiki BertEmbeddings from xyma +author: John Snow Labs +name: costa_wiki +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `costa_wiki` is an English model originally trained by xyma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/costa_wiki_en_5.1.1_3.0_1694624815803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/costa_wiki_en_5.1.1_3.0_1694624815803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("costa_wiki","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("costa_wiki", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|costa_wiki| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/xyma/COSTA-wiki \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cotmae_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-13-cotmae_base_uncased_en.md new file mode 100644 index 00000000000000..ec4b67fc38f637 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cotmae_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cotmae_base_uncased BertEmbeddings from caskcsg +author: John Snow Labs +name: cotmae_base_uncased +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cotmae_base_uncased` is an English model originally trained by caskcsg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cotmae_base_uncased_en_5.1.1_3.0_1694617137606.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cotmae_base_uncased_en_5.1.1_3.0_1694617137606.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cotmae_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cotmae_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cotmae_base_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.7 MB| + +## References + +https://huggingface.co/caskcsg/cotmae_base_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-ct_pubmedbert_re_en.md b/docs/_posts/ahmedlone127/2023-09-13-ct_pubmedbert_re_en.md new file mode 100644 index 00000000000000..c624a3e3d61279 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-ct_pubmedbert_re_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ct_pubmedbert_re BertEmbeddings from zhangzeyu +author: John Snow Labs +name: ct_pubmedbert_re +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ct_pubmedbert_re` is an English model originally trained by zhangzeyu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ct_pubmedbert_re_en_5.1.1_3.0_1694640469042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ct_pubmedbert_re_en_5.1.1_3.0_1694640469042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("ct_pubmedbert_re","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("ct_pubmedbert_re", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ct_pubmedbert_re| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/zhangzeyu/CT-PubMedBERT-RE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_chunkedv1_en.md b/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_chunkedv1_en.md new file mode 100644 index 00000000000000..be8ae5fc9970d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_chunkedv1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cxr_bioclinicalbert_chunkedv1 BertEmbeddings from ICLbioengNLP +author: John Snow Labs +name: cxr_bioclinicalbert_chunkedv1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cxr_bioclinicalbert_chunkedv1` is an English model originally trained by ICLbioengNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cxr_bioclinicalbert_chunkedv1_en_5.1.1_3.0_1694616858394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cxr_bioclinicalbert_chunkedv1_en_5.1.1_3.0_1694616858394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cxr_bioclinicalbert_chunkedv1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cxr_bioclinicalbert_chunkedv1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cxr_bioclinicalbert_chunkedv1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ICLbioengNLP/CXR_BioClinicalBERT_chunkedv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_mlm_en.md new file mode 100644 index 00000000000000..86a2d2e4dc25c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cxr_bioclinicalbert_mlm BertEmbeddings from ICLbioengNLP +author: John Snow Labs +name: cxr_bioclinicalbert_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cxr_bioclinicalbert_mlm` is an English model originally trained by ICLbioengNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cxr_bioclinicalbert_mlm_en_5.1.1_3.0_1694628615018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cxr_bioclinicalbert_mlm_en_5.1.1_3.0_1694628615018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cxr_bioclinicalbert_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cxr_bioclinicalbert_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cxr_bioclinicalbert_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.0 MB| + +## References + +https://huggingface.co/ICLbioengNLP/CXR_BioClinicalBERT_MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_v1_en.md b/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_v1_en.md new file mode 100644 index 00000000000000..dc32f1b8ccf23b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-cxr_bioclinicalbert_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cxr_bioclinicalbert_v1 BertEmbeddings from dorltcheng +author: John Snow Labs +name: cxr_bioclinicalbert_v1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cxr_bioclinicalbert_v1` is an English model originally trained by dorltcheng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cxr_bioclinicalbert_v1_en_5.1.1_3.0_1694613574482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cxr_bioclinicalbert_v1_en_5.1.1_3.0_1694613574482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("cxr_bioclinicalbert_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("cxr_bioclinicalbert_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cxr_bioclinicalbert_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.3 MB| + +## References + +https://huggingface.co/dorltcheng/CXR_BioClinicalBERT_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dabert_multi_en.md b/docs/_posts/ahmedlone127/2023-09-13-dabert_multi_en.md new file mode 100644 index 00000000000000..599fbb123c6dbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dabert_multi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dabert_multi BertEmbeddings from christofid +author: John Snow Labs +name: dabert_multi +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dabert_multi` is an English model originally trained by christofid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dabert_multi_en_5.1.1_3.0_1694647641174.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dabert_multi_en_5.1.1_3.0_1694647641174.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("dabert_multi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("dabert_multi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dabert_multi| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/christofid/dabert-multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dajobbert_base_uncased_da.md b/docs/_posts/ahmedlone127/2023-09-13-dajobbert_base_uncased_da.md new file mode 100644 index 00000000000000..a5c41aaf14f5b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dajobbert_base_uncased_da.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Danish dajobbert_base_uncased BertEmbeddings from jjzha +author: John Snow Labs +name: dajobbert_base_uncased +date: 2023-09-13 +tags: [bert, da, open_source, fill_mask, onnx] +task: Embeddings +language: da +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dajobbert_base_uncased` is a Danish model originally trained by jjzha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dajobbert_base_uncased_da_5.1.1_3.0_1694632323819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dajobbert_base_uncased_da_5.1.1_3.0_1694632323819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
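+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dajobbert_base_uncased","da") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dajobbert_base_uncased", "da") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +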
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dajobbert_base_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|da| +|Size:|411.3 MB| + +## References + +https://huggingface.co/jjzha/dajobbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dapbert_en.md b/docs/_posts/ahmedlone127/2023-09-13-dapbert_en.md new file mode 100644 index 00000000000000..c85685f66e7789 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dapbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dapbert BertEmbeddings from christofid +author: John Snow Labs +name: dapbert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dapbert` is an English model originally trained by christofid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dapbert_en_5.1.1_3.0_1694645713628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dapbert_en_5.1.1_3.0_1694645713628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
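+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dapbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dapbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +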
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dapbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/christofid/dapbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dapscibert_en.md b/docs/_posts/ahmedlone127/2023-09-13-dapscibert_en.md new file mode 100644 index 00000000000000..f9ef2c5902ac33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dapscibert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dapscibert BertEmbeddings from christofid +author: John Snow Labs +name: dapscibert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dapscibert` is an English model originally trained by christofid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dapscibert_en_5.1.1_3.0_1694646154402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dapscibert_en_5.1.1_3.0_1694646154402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
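+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dapscibert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dapscibert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +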
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dapscibert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/christofid/dapscibert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_020_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_020_en.md new file mode 100644 index 00000000000000..38f4bfbc4f4b8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_bertnsp_020 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_bertnsp_020 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_bertnsp_020` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_020_en_5.1.1_3.0_1694621826042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_020_en_5.1.1_3.0_1694621826042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
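+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_bertnsp_020","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_bertnsp_020", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +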
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_bertnsp_020| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-bertnsp-020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_120_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_120_en.md new file mode 100644 index 00000000000000..62a6ed916cfdca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_120_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_bertnsp_120 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_bertnsp_120 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_bertnsp_120` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_120_en_5.1.1_3.0_1694622137182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_120_en_5.1.1_3.0_1694622137182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
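+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_bertnsp_120","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_bertnsp_120", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +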
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_bertnsp_120| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-bertnsp-120 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_220_en.md new file mode 100644 index 00000000000000..cf965b9cc9cf88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_bertnsp_220 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_bertnsp_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_bertnsp_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_220_en_5.1.1_3.0_1694622510938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_220_en_5.1.1_3.0_1694622510938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
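+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_bertnsp_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_bertnsp_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +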
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_bertnsp_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-bertnsp-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_320_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_320_en.md new file mode 100644 index 00000000000000..d4913fff9334d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_320_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_bertnsp_320 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_bertnsp_320 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_bertnsp_320` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_320_en_5.1.1_3.0_1694622790193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_320_en_5.1.1_3.0_1694622790193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
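+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_bertnsp_320","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_bertnsp_320", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +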
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_bertnsp_320| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-bertnsp-320 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_520_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_520_en.md new file mode 100644 index 00000000000000..9810c233518a59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_bertnsp_520_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_bertnsp_520 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_bertnsp_520 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_bertnsp_520` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_520_en_5.1.1_3.0_1694623237282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_bertnsp_520_en_5.1.1_3.0_1694623237282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
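+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_bertnsp_520","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_bertnsp_520", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +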
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_bertnsp_520| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-bertnsp-520 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_020_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_020_en.md new file mode 100644 index 00000000000000..ca13cd1b8e3cce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocomae_020 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocomae_020 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocomae_020` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocomae_020_en_5.1.1_3.0_1694619413332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocomae_020_en_5.1.1_3.0_1694619413332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
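+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocomae_020","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocomae_020", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +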
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocomae_020| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocomae-020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_120_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_120_en.md new file mode 100644 index 00000000000000..29e7b1aa1d0bbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_120_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocomae_120 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocomae_120 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocomae_120` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocomae_120_en_5.1.1_3.0_1694619848167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocomae_120_en_5.1.1_3.0_1694619848167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
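+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocomae_120","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocomae_120", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +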
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocomae_120| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocomae-120 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_220_en.md new file mode 100644 index 00000000000000..2ab4db6d7de341 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocomae_220 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocomae_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocomae_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocomae_220_en_5.1.1_3.0_1694620314728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocomae_220_en_5.1.1_3.0_1694620314728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
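+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocomae_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocomae_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +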
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocomae_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocomae-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_320_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_320_en.md new file mode 100644 index 00000000000000..3f35572f9173df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_320_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocomae_320 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocomae_320 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocomae_320` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocomae_320_en_5.1.1_3.0_1694620794927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocomae_320_en_5.1.1_3.0_1694620794927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
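+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocomae_320","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocomae_320", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +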
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocomae_320| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocomae-320 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_520_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_520_en.md new file mode 100644 index 00000000000000..c0f019de3ba24e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocomae_520_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocomae_520 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocomae_520 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocomae_520` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocomae_520_en_5.1.1_3.0_1694621336068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocomae_520_en_5.1.1_3.0_1694621336068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
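+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocomae_520","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +# `data` is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so a Tokenizer stage must precede it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocomae_520", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +// `data` is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +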
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocomae_520| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cocomae-520 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_020_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_020_en.md new file mode 100644 index 00000000000000..3de166049c912d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocondenser_020 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocondenser_020 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocondenser_020` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_020_en_5.1.1_3.0_1694623728257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_020_en_5.1.1_3.0_1694623728257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocondenser_020","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocondenser_020", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocondenser_020| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocondenser-020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_120_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_120_en.md new file mode 100644 index 00000000000000..b8817b95d90bcb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_120_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocondenser_120 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocondenser_120 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocondenser_120` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_120_en_5.1.1_3.0_1694623975734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_120_en_5.1.1_3.0_1694623975734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocondenser_120","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocondenser_120", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocondenser_120| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocondenser-120 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_220_en.md new file mode 100644 index 00000000000000..8b913747e55857 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocondenser_220 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocondenser_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocondenser_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_220_en_5.1.1_3.0_1694624315821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_220_en_5.1.1_3.0_1694624315821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocondenser_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocondenser_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocondenser_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocondenser-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_320_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_320_en.md new file mode 100644 index 00000000000000..6959bcc77bb353 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_320_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocondenser_320 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocondenser_320 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocondenser_320` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_320_en_5.1.1_3.0_1694624581698.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_320_en_5.1.1_3.0_1694624581698.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocondenser_320","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocondenser_320", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocondenser_320| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocondenser-320 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_520_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_520_en.md new file mode 100644 index 00000000000000..7fb21f76226d8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cocondenser_520_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cocondenser_520 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cocondenser_520 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cocondenser_520` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_520_en_5.1.1_3.0_1694624864486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cocondenser_520_en_5.1.1_3.0_1694624864486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cocondenser_520","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cocondenser_520", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cocondenser_520| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/dpr-cocondenser-520 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_020_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_020_en.md new file mode 100644 index 00000000000000..84f3f8fa3309d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotbert_020 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotbert_020 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotbert_020` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotbert_020_en_5.1.1_3.0_1694627301790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotbert_020_en_5.1.1_3.0_1694627301790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cotbert_020","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cotbert_020", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotbert_020| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotbert-020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_120_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_120_en.md new file mode 100644 index 00000000000000..75a0082cd419cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_120_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotbert_120 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotbert_120 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotbert_120` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotbert_120_en_5.1.1_3.0_1694627761408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotbert_120_en_5.1.1_3.0_1694627761408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cotbert_120","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cotbert_120", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotbert_120| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotbert-120 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_220_en.md new file mode 100644 index 00000000000000..483fa4dc5e631e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotbert_220 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotbert_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotbert_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotbert_220_en_5.1.1_3.0_1694628278286.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotbert_220_en_5.1.1_3.0_1694628278286.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cotbert_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cotbert_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotbert_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotbert-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_320_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_320_en.md new file mode 100644 index 00000000000000..1ef444321ec991 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_320_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotbert_320 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotbert_320 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotbert_320` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotbert_320_en_5.1.1_3.0_1694628675772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotbert_320_en_5.1.1_3.0_1694628675772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cotbert_320","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cotbert_320", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotbert_320| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotbert-320 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_520_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_520_en.md new file mode 100644 index 00000000000000..17c9e4e3514ffc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotbert_520_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotbert_520 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotbert_520 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotbert_520` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotbert_520_en_5.1.1_3.0_1694629038955.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotbert_520_en_5.1.1_3.0_1694629038955.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cotbert_520","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cotbert_520", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotbert_520| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotbert-520 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_020_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_020_en.md new file mode 100644 index 00000000000000..85b51b65aacf58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotmae_020 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotmae_020 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotmae_020` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotmae_020_en_5.1.1_3.0_1694625405708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotmae_020_en_5.1.1_3.0_1694625405708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cotmae_020","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cotmae_020", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotmae_020| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotmae-020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_120_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_120_en.md new file mode 100644 index 00000000000000..d460cfe9823ac2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_120_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotmae_120 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotmae_120 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotmae_120` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotmae_120_en_5.1.1_3.0_1694625762302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotmae_120_en_5.1.1_3.0_1694625762302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("dpr_cotmae_120","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("dpr_cotmae_120", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotmae_120| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotmae-120 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_220_en.md new file mode 100644 index 00000000000000..7ff1f9294f7e0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotmae_220 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotmae_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotmae_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotmae_220_en_5.1.1_3.0_1694626146098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotmae_220_en_5.1.1_3.0_1694626146098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_cotmae_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_cotmae_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotmae_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotmae-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_320_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_320_en.md new file mode 100644 index 00000000000000..60e1e655809b9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_320_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotmae_320 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotmae_320 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotmae_320` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotmae_320_en_5.1.1_3.0_1694626464675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotmae_320_en_5.1.1_3.0_1694626464675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_cotmae_320","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_cotmae_320", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotmae_320| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotmae-320 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_520_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_520_en.md new file mode 100644 index 00000000000000..c6a19128a03475 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_cotmae_520_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_cotmae_520 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_cotmae_520 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_cotmae_520` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_cotmae_520_en_5.1.1_3.0_1694626881352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_cotmae_520_en_5.1.1_3.0_1694626881352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_cotmae_520","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_cotmae_520", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_cotmae_520| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/dpr-cotmae-520 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_passage_encoder_allqa_base_es.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_passage_encoder_allqa_base_es.md new file mode 100644 index 00000000000000..85c88b6518bbe8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_passage_encoder_allqa_base_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish dpr_spanish_passage_encoder_allqa_base BertEmbeddings from IIC +author: John Snow Labs +name: dpr_spanish_passage_encoder_allqa_base +date: 2023-09-13 +tags: [bert, es, open_source, fill_mask, onnx] +task: Embeddings +language: es +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_spanish_passage_encoder_allqa_base` is a Castilian, Spanish model originally trained by IIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_spanish_passage_encoder_allqa_base_es_5.1.1_3.0_1694619434308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_spanish_passage_encoder_allqa_base_es_5.1.1_3.0_1694619434308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_spanish_passage_encoder_allqa_base","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_spanish_passage_encoder_allqa_base", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_spanish_passage_encoder_allqa_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/IIC/dpr-spanish-passage_encoder-allqa-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_passage_encoder_squades_base_es.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_passage_encoder_squades_base_es.md new file mode 100644 index 00000000000000..1e089f5458263f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_passage_encoder_squades_base_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish dpr_spanish_passage_encoder_squades_base BertEmbeddings from IIC +author: John Snow Labs +name: dpr_spanish_passage_encoder_squades_base +date: 2023-09-13 +tags: [bert, es, open_source, fill_mask, onnx] +task: Embeddings +language: es +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_spanish_passage_encoder_squades_base` is a Castilian, Spanish model originally trained by IIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_spanish_passage_encoder_squades_base_es_5.1.1_3.0_1694612516347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_spanish_passage_encoder_squades_base_es_5.1.1_3.0_1694612516347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_spanish_passage_encoder_squades_base","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_spanish_passage_encoder_squades_base", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_spanish_passage_encoder_squades_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/IIC/dpr-spanish-passage_encoder-squades-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_question_encoder_allqa_base_es.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_question_encoder_allqa_base_es.md new file mode 100644 index 00000000000000..3bbe62430dcd6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_question_encoder_allqa_base_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish dpr_spanish_question_encoder_allqa_base BertEmbeddings from IIC +author: John Snow Labs +name: dpr_spanish_question_encoder_allqa_base +date: 2023-09-13 +tags: [bert, es, open_source, fill_mask, onnx] +task: Embeddings +language: es +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_spanish_question_encoder_allqa_base` is a Castilian, Spanish model originally trained by IIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_spanish_question_encoder_allqa_base_es_5.1.1_3.0_1694619850497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_spanish_question_encoder_allqa_base_es_5.1.1_3.0_1694619850497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_spanish_question_encoder_allqa_base","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_spanish_question_encoder_allqa_base", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_spanish_question_encoder_allqa_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/IIC/dpr-spanish-question_encoder-allqa-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_question_encoder_squades_base_es.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_question_encoder_squades_base_es.md new file mode 100644 index 00000000000000..51fa558ec481ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_spanish_question_encoder_squades_base_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish dpr_spanish_question_encoder_squades_base BertEmbeddings from IIC +author: John Snow Labs +name: dpr_spanish_question_encoder_squades_base +date: 2023-09-13 +tags: [bert, es, open_source, fill_mask, onnx] +task: Embeddings +language: es +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_spanish_question_encoder_squades_base` is a Castilian, Spanish model originally trained by IIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_spanish_question_encoder_squades_base_es_5.1.1_3.0_1694613007694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_spanish_question_encoder_squades_base_es_5.1.1_3.0_1694613007694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_spanish_question_encoder_squades_base","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_spanish_question_encoder_squades_base", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_spanish_question_encoder_squades_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/IIC/dpr-spanish-question_encoder-squades-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_020_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_020_en.md new file mode 100644 index 00000000000000..1010fac53bccba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_vanilla_bert_020 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_vanilla_bert_020 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_vanilla_bert_020` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_020_en_5.1.1_3.0_1694629366872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_020_en_5.1.1_3.0_1694629366872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_vanilla_bert_020","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_vanilla_bert_020", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_vanilla_bert_020| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/approach0/dpr-vanilla-bert-020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_120_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_120_en.md new file mode 100644 index 00000000000000..1dff432c9b6fd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_120_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_vanilla_bert_120 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_vanilla_bert_120 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_vanilla_bert_120` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_120_en_5.1.1_3.0_1694629722071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_120_en_5.1.1_3.0_1694629722071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_vanilla_bert_120","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_vanilla_bert_120", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_vanilla_bert_120| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/approach0/dpr-vanilla-bert-120 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_220_en.md new file mode 100644 index 00000000000000..d202ba7dad79ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_vanilla_bert_220 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_vanilla_bert_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_vanilla_bert_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_220_en_5.1.1_3.0_1694630120901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_220_en_5.1.1_3.0_1694630120901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_vanilla_bert_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_vanilla_bert_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_vanilla_bert_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/approach0/dpr-vanilla-bert-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_320_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_320_en.md new file mode 100644 index 00000000000000..d83e37f7a2dc99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_320_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_vanilla_bert_320 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_vanilla_bert_320 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_vanilla_bert_320` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_320_en_5.1.1_3.0_1694630513466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_320_en_5.1.1_3.0_1694630513466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_vanilla_bert_320","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_vanilla_bert_320", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_vanilla_bert_320| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/approach0/dpr-vanilla-bert-320 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_520_en.md b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_520_en.md new file mode 100644 index 00000000000000..9f1cf37b6f1039 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-dpr_vanilla_bert_520_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dpr_vanilla_bert_520 BertEmbeddings from approach0 +author: John Snow Labs +name: dpr_vanilla_bert_520 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dpr_vanilla_bert_520` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_520_en_5.1.1_3.0_1694630979516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dpr_vanilla_bert_520_en_5.1.1_3.0_1694630979516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column + +embeddings = BertEmbeddings.pretrained("dpr_vanilla_bert_520","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column + +val embeddings = BertEmbeddings + .pretrained("dpr_vanilla_bert_520", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dpr_vanilla_bert_520| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/approach0/dpr-vanilla-bert-520 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-duck_and_cover_genre_encoder_en.md b/docs/_posts/ahmedlone127/2023-09-13-duck_and_cover_genre_encoder_en.md new file mode 100644 index 00000000000000..7047da12b2b3a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-duck_and_cover_genre_encoder_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English duck_and_cover_genre_encoder BertEmbeddings from mnne +author: John Snow Labs +name: duck_and_cover_genre_encoder +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `duck_and_cover_genre_encoder` is an English model originally trained by mnne. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/duck_and_cover_genre_encoder_en_5.1.1_3.0_1694625838242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/duck_and_cover_genre_encoder_en_5.1.1_3.0_1694625838242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("duck_and_cover_genre_encoder","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("duck_and_cover_genre_encoder", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|duck_and_cover_genre_encoder| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.8 MB| + +## References + +https://huggingface.co/mnne/duck-and-cover-genre-encoder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-econobert_en.md b/docs/_posts/ahmedlone127/2023-09-13-econobert_en.md new file mode 100644 index 00000000000000..200aebd06695e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-econobert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English econobert BertEmbeddings from samchain +author: John Snow Labs +name: econobert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `econobert` is an English model originally trained by samchain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/econobert_en_5.1.1_3.0_1694622554131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/econobert_en_5.1.1_3.0_1694622554131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("econobert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("econobert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|econobert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/samchain/EconoBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-fast_food_pure_relation_v0_en.md b/docs/_posts/ahmedlone127/2023-09-13-fast_food_pure_relation_v0_en.md new file mode 100644 index 00000000000000..82b1df2016aac1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-fast_food_pure_relation_v0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English fast_food_pure_relation_v0 BertEmbeddings from yogeshchandrasekharuni +author: John Snow Labs +name: fast_food_pure_relation_v0 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fast_food_pure_relation_v0` is an English model originally trained by yogeshchandrasekharuni. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fast_food_pure_relation_v0_en_5.1.1_3.0_1694618849500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fast_food_pure_relation_v0_en_5.1.1_3.0_1694618849500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("fast_food_pure_relation_v0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("fast_food_pure_relation_v0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fast_food_pure_relation_v0| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/yogeshchandrasekharuni/fast-food-pure-relation-v0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-fbert_en.md b/docs/_posts/ahmedlone127/2023-09-13-fbert_en.md new file mode 100644 index 00000000000000..f8b56cc3353814 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-fbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English fbert BertEmbeddings from diptanu +author: John Snow Labs +name: fbert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fbert` is an English model originally trained by diptanu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fbert_en_5.1.1_3.0_1694624971451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fbert_en_5.1.1_3.0_1694624971451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("fbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("fbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.2 MB| + +## References + +https://huggingface.co/diptanu/fBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-fernet_c5_cs.md b/docs/_posts/ahmedlone127/2023-09-13-fernet_c5_cs.md new file mode 100644 index 00000000000000..f7e3d06fb0d1de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-fernet_c5_cs.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Czech fernet_c5 BertEmbeddings from fav-kky +author: John Snow Labs +name: fernet_c5 +date: 2023-09-13 +tags: [bert, cs, open_source, fill_mask, onnx] +task: Embeddings +language: cs +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fernet_c5` is a Czech model originally trained by fav-kky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fernet_c5_cs_5.1.1_3.0_1694636619880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fernet_c5_cs_5.1.1_3.0_1694636619880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("fernet_c5","cs") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("fernet_c5", "cs") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fernet_c5| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|cs| +|Size:|609.5 MB| + +## References + +https://huggingface.co/fav-kky/FERNET-C5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-fernet_cc_slovak_sk.md b/docs/_posts/ahmedlone127/2023-09-13-fernet_cc_slovak_sk.md new file mode 100644 index 00000000000000..399e209c6e5596 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-fernet_cc_slovak_sk.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Slovak fernet_cc_slovak BertEmbeddings from fav-kky +author: John Snow Labs +name: fernet_cc_slovak +date: 2023-09-13 +tags: [bert, sk, open_source, fill_mask, onnx] +task: Embeddings +language: sk +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fernet_cc_slovak` is a Slovak model originally trained by fav-kky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fernet_cc_slovak_sk_5.1.1_3.0_1694637045822.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fernet_cc_slovak_sk_5.1.1_3.0_1694637045822.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("fernet_cc_slovak","sk") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("fernet_cc_slovak", "sk") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fernet_cc_slovak| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|sk| +|Size:|609.4 MB| + +## References + +https://huggingface.co/fav-kky/FERNET-CC_sk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-film20000bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-13-film20000bert_base_uncased_en.md new file mode 100644 index 00000000000000..bd4bc9f9a5acfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-film20000bert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English film20000bert_base_uncased BertEmbeddings from AmaiaSolaun +author: John Snow Labs +name: film20000bert_base_uncased +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `film20000bert_base_uncased` is an English model originally trained by AmaiaSolaun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/film20000bert_base_uncased_en_5.1.1_3.0_1694587815260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/film20000bert_base_uncased_en_5.1.1_3.0_1694587815260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("film20000bert_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("film20000bert_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|film20000bert_base_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/AmaiaSolaun/film20000bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-finbert_pretrain_yiyanghkust_en.md b/docs/_posts/ahmedlone127/2023-09-13-finbert_pretrain_yiyanghkust_en.md new file mode 100644 index 00000000000000..7e19262bbe0b91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-finbert_pretrain_yiyanghkust_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finbert_pretrain_yiyanghkust BertEmbeddings from philschmid +author: John Snow Labs +name: finbert_pretrain_yiyanghkust +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_pretrain_yiyanghkust` is an English model originally trained by philschmid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_pretrain_yiyanghkust_en_5.1.1_3.0_1694563205165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_pretrain_yiyanghkust_en_5.1.1_3.0_1694563205165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("finbert_pretrain_yiyanghkust","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("finbert_pretrain_yiyanghkust", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert_pretrain_yiyanghkust| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/philschmid/finbert-pretrain-yiyanghkust \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-fine_tune_bert_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-13-fine_tune_bert_mlm_en.md new file mode 100644 index 00000000000000..30f540b20bfd4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-fine_tune_bert_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English fine_tune_bert_mlm BertEmbeddings from mjavadmt +author: John Snow Labs +name: fine_tune_bert_mlm +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_bert_mlm` is an English model originally trained by mjavadmt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_bert_mlm_en_5.1.1_3.0_1694622010218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_bert_mlm_en_5.1.1_3.0_1694622010218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("fine_tune_bert_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("fine_tune_bert_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_bert_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|606.4 MB| + +## References + +https://huggingface.co/mjavadmt/fine-tune-BERT-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-further_train_domain_20_en.md b/docs/_posts/ahmedlone127/2023-09-13-further_train_domain_20_en.md new file mode 100644 index 00000000000000..c7d3db75c2fdcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-further_train_domain_20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English further_train_domain_20 BertEmbeddings from onlydj96 +author: John Snow Labs +name: further_train_domain_20 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `further_train_domain_20` is an English model originally trained by onlydj96. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/further_train_domain_20_en_5.1.1_3.0_1694588464159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/further_train_domain_20_en_5.1.1_3.0_1694588464159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("further_train_domain_20","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("further_train_domain_20", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|further_train_domain_20| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|441.2 MB| + +## References + +https://huggingface.co/onlydj96/further_train_domain_20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-german_medbert_issues_128_de.md b/docs/_posts/ahmedlone127/2023-09-13-german_medbert_issues_128_de.md new file mode 100644 index 00000000000000..94617b60f14ed3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-german_medbert_issues_128_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German german_medbert_issues_128 BertEmbeddings from ogimgio +author: John Snow Labs +name: german_medbert_issues_128 +date: 2023-09-13 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german_medbert_issues_128` is a German model originally trained by ogimgio. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/german_medbert_issues_128_de_5.1.1_3.0_1694612124790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/german_medbert_issues_128_de_5.1.1_3.0_1694612124790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("german_medbert_issues_128","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("german_medbert_issues_128", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|german_medbert_issues_128| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|406.9 MB| + +## References + +https://huggingface.co/ogimgio/German-MedBERT-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-greeksocialbert_base_greek_social_media_v2_el.md b/docs/_posts/ahmedlone127/2023-09-13-greeksocialbert_base_greek_social_media_v2_el.md new file mode 100644 index 00000000000000..2eecd22367db6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-greeksocialbert_base_greek_social_media_v2_el.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Modern Greek (1453-) greeksocialbert_base_greek_social_media_v2 BertEmbeddings from pchatz +author: John Snow Labs +name: greeksocialbert_base_greek_social_media_v2 +date: 2023-09-13 +tags: [bert, el, open_source, fill_mask, onnx] +task: Embeddings +language: el +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `greeksocialbert_base_greek_social_media_v2` is a Modern Greek (1453-) model originally trained by pchatz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/greeksocialbert_base_greek_social_media_v2_el_5.1.1_3.0_1694613573415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/greeksocialbert_base_greek_social_media_v2_el_5.1.1_3.0_1694613573415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("greeksocialbert_base_greek_social_media_v2","el") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("greeksocialbert_base_greek_social_media_v2", "el") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|greeksocialbert_base_greek_social_media_v2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|el| +|Size:|421.3 MB| + +## References + +https://huggingface.co/pchatz/greeksocialbert-base-greek-social-media-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-greeksocialbert_base_greek_uncased_v1_el.md b/docs/_posts/ahmedlone127/2023-09-13-greeksocialbert_base_greek_uncased_v1_el.md new file mode 100644 index 00000000000000..c20e71174595ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-greeksocialbert_base_greek_uncased_v1_el.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Modern Greek (1453-) greeksocialbert_base_greek_uncased_v1 BertEmbeddings from gealexandri +author: John Snow Labs +name: greeksocialbert_base_greek_uncased_v1 +date: 2023-09-13 +tags: [bert, el, open_source, fill_mask, onnx] +task: Embeddings +language: el +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `greeksocialbert_base_greek_uncased_v1` is a Modern Greek (1453-) model originally trained by gealexandri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/greeksocialbert_base_greek_uncased_v1_el_5.1.1_3.0_1694648049169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/greeksocialbert_base_greek_uncased_v1_el_5.1.1_3.0_1694648049169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("greeksocialbert_base_greek_uncased_v1","el") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("greeksocialbert_base_greek_uncased_v1", "el") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|greeksocialbert_base_greek_uncased_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|el| +|Size:|421.3 MB| + +## References + +https://huggingface.co/gealexandri/greeksocialbert-base-greek-uncased-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-gujarati_bert_gu.md b/docs/_posts/ahmedlone127/2023-09-13-gujarati_bert_gu.md new file mode 100644 index 00000000000000..afebfe448e345c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-gujarati_bert_gu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Gujarati gujarati_bert BertEmbeddings from l3cube-pune +author: John Snow Labs +name: gujarati_bert +date: 2023-09-13 +tags: [bert, gu, open_source, fill_mask, onnx] +task: Embeddings +language: gu +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gujarati_bert` is a Gujarati model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gujarati_bert_gu_5.1.1_3.0_1694642031980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gujarati_bert_gu_5.1.1_3.0_1694642031980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gujarati_bert","gu") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gujarati_bert", "gu") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gujarati_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|gu| +|Size:|890.4 MB| + +## References + +https://huggingface.co/l3cube-pune/gujarati-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-gujiroberta_fan_en.md b/docs/_posts/ahmedlone127/2023-09-13-gujiroberta_fan_en.md new file mode 100644 index 00000000000000..ad0d5548c74829 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-gujiroberta_fan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gujiroberta_fan BertEmbeddings from hsc748NLP +author: John Snow Labs +name: gujiroberta_fan +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gujiroberta_fan` is an English model originally trained by hsc748NLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gujiroberta_fan_en_5.1.1_3.0_1694602043407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gujiroberta_fan_en_5.1.1_3.0_1694602043407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gujiroberta_fan","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gujiroberta_fan", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gujiroberta_fan| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|420.2 MB| + +## References + +https://huggingface.co/hsc748NLP/GujiRoBERTa_fan \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-hindi_bert_v2_hi.md b/docs/_posts/ahmedlone127/2023-09-13-hindi_bert_v2_hi.md new file mode 100644 index 00000000000000..5f8abc7871dc57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-hindi_bert_v2_hi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hindi hindi_bert_v2 BertEmbeddings from l3cube-pune +author: John Snow Labs +name: hindi_bert_v2 +date: 2023-09-13 +tags: [bert, hi, open_source, fill_mask, onnx] +task: Embeddings +language: hi +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hindi_bert_v2` is a Hindi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hindi_bert_v2_hi_5.1.1_3.0_1694576005164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hindi_bert_v2_hi_5.1.1_3.0_1694576005164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("hindi_bert_v2","hi") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("hindi_bert_v2", "hi") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hindi_bert_v2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|hi| +|Size:|890.6 MB| + +## References + +https://huggingface.co/l3cube-pune/hindi-bert-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-hindi_tweets_bert_scratch_hi.md b/docs/_posts/ahmedlone127/2023-09-13-hindi_tweets_bert_scratch_hi.md new file mode 100644 index 00000000000000..e4bae79be16f65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-hindi_tweets_bert_scratch_hi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hindi hindi_tweets_bert_scratch BertEmbeddings from l3cube-pune +author: John Snow Labs +name: hindi_tweets_bert_scratch +date: 2023-09-13 +tags: [bert, hi, open_source, fill_mask, onnx] +task: Embeddings +language: hi +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hindi_tweets_bert_scratch` is a Hindi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hindi_tweets_bert_scratch_hi_5.1.1_3.0_1694632323915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hindi_tweets_bert_scratch_hi_5.1.1_3.0_1694632323915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("hindi_tweets_bert_scratch","hi") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("hindi_tweets_bert_scratch", "hi") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hindi_tweets_bert_scratch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|hi| +|Size:|470.5 MB| + +## References + +https://huggingface.co/l3cube-pune/hindi-tweets-bert-scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-hing_mbert_mixed_hi.md b/docs/_posts/ahmedlone127/2023-09-13-hing_mbert_mixed_hi.md new file mode 100644 index 00000000000000..d326b13d1ee1a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-hing_mbert_mixed_hi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hindi hing_mbert_mixed BertEmbeddings from l3cube-pune +author: John Snow Labs +name: hing_mbert_mixed +date: 2023-09-13 +tags: [bert, hi, open_source, fill_mask, onnx] +task: Embeddings +language: hi +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hing_mbert_mixed` is a Hindi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hing_mbert_mixed_hi_5.1.1_3.0_1694618447086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hing_mbert_mixed_hi_5.1.1_3.0_1694618447086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("hing_mbert_mixed","hi") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("hing_mbert_mixed", "hi") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hing_mbert_mixed| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|hi| +|Size:|664.9 MB| + +## References + +https://huggingface.co/l3cube-pune/hing-mbert-mixed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-hinglish_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-13-hinglish_finetuned_en.md new file mode 100644 index 00000000000000..d1358666fe5932 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-hinglish_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hinglish_finetuned BertEmbeddings from ketan-rmcf +author: John Snow Labs +name: hinglish_finetuned +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hinglish_finetuned` is an English model originally trained by ketan-rmcf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hinglish_finetuned_en_5.1.1_3.0_1694636325882.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hinglish_finetuned_en_5.1.1_3.0_1694636325882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("hinglish_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("hinglish_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hinglish_finetuned| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/ketan-rmcf/hinglish-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_1000k_english_xx.md b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_1000k_english_xx.md new file mode 100644 index 00000000000000..2c7bf1886e701b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_1000k_english_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual incel_bert_base_multilingual_cased_1000k_english BertEmbeddings from pgajo +author: John Snow Labs +name: incel_bert_base_multilingual_cased_1000k_english +date: 2023-09-13 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `incel_bert_base_multilingual_cased_1000k_english` is a Multilingual model originally trained by pgajo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/incel_bert_base_multilingual_cased_1000k_english_xx_5.1.1_3.0_1694613006554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/incel_bert_base_multilingual_cased_1000k_english_xx_5.1.1_3.0_1694613006554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("incel_bert_base_multilingual_cased_1000k_english","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("incel_bert_base_multilingual_cased_1000k_english", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|incel_bert_base_multilingual_cased_1000k_english| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/pgajo/incel-bert-base-multilingual-cased-1000k_english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_1000k_multi_xx.md b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_1000k_multi_xx.md new file mode 100644 index 00000000000000..ce868db95ed37e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_1000k_multi_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual incel_bert_base_multilingual_cased_1000k_multi BertEmbeddings from pgajo +author: John Snow Labs +name: incel_bert_base_multilingual_cased_1000k_multi +date: 2023-09-13 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `incel_bert_base_multilingual_cased_1000k_multi` is a Multilingual model originally trained by pgajo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/incel_bert_base_multilingual_cased_1000k_multi_xx_5.1.1_3.0_1694613670535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/incel_bert_base_multilingual_cased_1000k_multi_xx_5.1.1_3.0_1694613670535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("incel_bert_base_multilingual_cased_1000k_multi","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("incel_bert_base_multilingual_cased_1000k_multi", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|incel_bert_base_multilingual_cased_1000k_multi| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/pgajo/incel-bert-base-multilingual-cased-1000k_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_627k_italian_xx.md b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_627k_italian_xx.md new file mode 100644 index 00000000000000..485adf1ed601bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_base_multilingual_cased_627k_italian_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual incel_bert_base_multilingual_cased_627k_italian BertEmbeddings from pgajo +author: John Snow Labs +name: incel_bert_base_multilingual_cased_627k_italian +date: 2023-09-13 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `incel_bert_base_multilingual_cased_627k_italian` is a Multilingual model originally trained by pgajo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/incel_bert_base_multilingual_cased_627k_italian_xx_5.1.1_3.0_1694612397380.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/incel_bert_base_multilingual_cased_627k_italian_xx_5.1.1_3.0_1694612397380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("incel_bert_base_multilingual_cased_627k_italian","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("incel_bert_base_multilingual_cased_627k_italian", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|incel_bert_base_multilingual_cased_627k_italian| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| + +## References + +https://huggingface.co/pgajo/incel-bert-base-multilingual-cased-627k_italian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian_en.md b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian_en.md new file mode 100644 index 00000000000000..aa8032572192c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian BertEmbeddings from pgajo +author: John Snow Labs +name: incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian` is an English model originally trained by pgajo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian_en_5.1.1_3.0_1694621049338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian_en_5.1.1_3.0_1694621049338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|incel_bert_uncased_l_12_h_768_a_12_italian_alb3rt0_627k_italian| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|688.7 MB| + +## References + +https://huggingface.co/pgajo/incel-bert_uncased_L-12_H-768_A-12_italian_alb3rt0-627k_italian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-indo_legalbert_id.md b/docs/_posts/ahmedlone127/2023-09-13-indo_legalbert_id.md new file mode 100644 index 00000000000000..3b082f0dec52ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-indo_legalbert_id.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Indonesian indo_legalbert BertEmbeddings from archi-ai +author: John Snow Labs +name: indo_legalbert +date: 2023-09-13 +tags: [bert, id, open_source, fill_mask, onnx] +task: Embeddings +language: id +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indo_legalbert` is an Indonesian model originally trained by archi-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indo_legalbert_id_5.1.1_3.0_1694623791636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indo_legalbert_id_5.1.1_3.0_1694623791636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("indo_legalbert","id") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("indo_legalbert", "id") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indo_legalbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|id| +|Size:|1.3 GB| + +## References + +https://huggingface.co/archi-ai/Indo-LegalBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_10k_en.md b/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_10k_en.md new file mode 100644 index 00000000000000..0d647abad05f0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_10k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English indobert_base_p2_finetuned_mer_10k BertEmbeddings from stevenwh +author: John Snow Labs +name: indobert_base_p2_finetuned_mer_10k +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_base_p2_finetuned_mer_10k` is an English model originally trained by stevenwh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_base_p2_finetuned_mer_10k_en_5.1.1_3.0_1694636325821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_base_p2_finetuned_mer_10k_en_5.1.1_3.0_1694636325821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("indobert_base_p2_finetuned_mer_10k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("indobert_base_p2_finetuned_mer_10k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_base_p2_finetuned_mer_10k| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/stevenwh/indobert-base-p2-finetuned-mer-10k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_80k_en.md b/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_80k_en.md new file mode 100644 index 00000000000000..bdbc7e1959afe3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_80k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English indobert_base_p2_finetuned_mer_80k BertEmbeddings from stevenwh +author: John Snow Labs +name: indobert_base_p2_finetuned_mer_80k +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_base_p2_finetuned_mer_80k` is an English model originally trained by stevenwh. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_base_p2_finetuned_mer_80k_en_5.1.1_3.0_1694636807078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_base_p2_finetuned_mer_80k_en_5.1.1_3.0_1694636807078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("indobert_base_p2_finetuned_mer_80k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("indobert_base_p2_finetuned_mer_80k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_base_p2_finetuned_mer_80k| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/stevenwh/indobert-base-p2-finetuned-mer-80k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_en.md b/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_en.md new file mode 100644 index 00000000000000..0f14ef376455f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-indobert_base_p2_finetuned_mer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English indobert_base_p2_finetuned_mer BertEmbeddings from stevenwh +author: John Snow Labs +name: indobert_base_p2_finetuned_mer +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_base_p2_finetuned_mer` is an English model originally trained by stevenwh. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_base_p2_finetuned_mer_en_5.1.1_3.0_1694629744633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_base_p2_finetuned_mer_en_5.1.1_3.0_1694629744633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("indobert_base_p2_finetuned_mer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("indobert_base_p2_finetuned_mer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_base_p2_finetuned_mer| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.1 MB| + +## References + +https://huggingface.co/stevenwh/indobert-base-p2-finetuned-mer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-jobbert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-09-13-jobbert_base_cased_en.md new file mode 100644 index 00000000000000..d189235ed2ed98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-jobbert_base_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jobbert_base_cased BertEmbeddings from jjzha +author: John Snow Labs +name: jobbert_base_cased +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobbert_base_cased` is an English model originally trained by jjzha. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobbert_base_cased_en_5.1.1_3.0_1694631841420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobbert_base_cased_en_5.1.1_3.0_1694631841420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("jobbert_base_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("jobbert_base_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobbert_base_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/jjzha/jobbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-kannada_bert_kn.md b/docs/_posts/ahmedlone127/2023-09-13-kannada_bert_kn.md new file mode 100644 index 00000000000000..d2953cc379d955 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-kannada_bert_kn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Kannada kannada_bert BertEmbeddings from l3cube-pune +author: John Snow Labs +name: kannada_bert +date: 2023-09-13 +tags: [bert, kn, open_source, fill_mask, onnx] +task: Embeddings +language: kn +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kannada_bert` is a Kannada model originally trained by l3cube-pune. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kannada_bert_kn_5.1.1_3.0_1694638806956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kannada_bert_kn_5.1.1_3.0_1694638806956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kannada_bert","kn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kannada_bert", "kn") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kannada_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|kn| +|Size:|890.5 MB| + +## References + +https://huggingface.co/l3cube-pune/kannada-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-kc_900_en.md b/docs/_posts/ahmedlone127/2023-09-13-kc_900_en.md new file mode 100644 index 00000000000000..105e0c7462f8a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-kc_900_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kc_900 BertEmbeddings from erica +author: John Snow Labs +name: kc_900 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kc_900` is an English model originally trained by erica. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kc_900_en_5.1.1_3.0_1694633305275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kc_900_en_5.1.1_3.0_1694633305275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kc_900","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kc_900", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kc_900| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/erica/kc_900 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-kcbase400_en.md b/docs/_posts/ahmedlone127/2023-09-13-kcbase400_en.md new file mode 100644 index 00000000000000..1e7a615c41a4b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-kcbase400_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kcbase400 BertEmbeddings from erica +author: John Snow Labs +name: kcbase400 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kcbase400` is an English model originally trained by erica. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kcbase400_en_5.1.1_3.0_1694633739043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kcbase400_en_5.1.1_3.0_1694633739043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kcbase400","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kcbase400", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kcbase400| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.3 MB| + +## References + +https://huggingface.co/erica/kcbase400 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-kcbert_base_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-13-kcbert_base_finetuned_en.md new file mode 100644 index 00000000000000..eab51b72428795 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-kcbert_base_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kcbert_base_finetuned BertEmbeddings from eno3940 +author: John Snow Labs +name: kcbert_base_finetuned +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kcbert_base_finetuned` is an English model originally trained by eno3940. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kcbert_base_finetuned_en_5.1.1_3.0_1694649289545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kcbert_base_finetuned_en_5.1.1_3.0_1694649289545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kcbert_base_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kcbert_base_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kcbert_base_finetuned| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.2 MB| + +## References + +https://huggingface.co/eno3940/kcbert-base-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-kcbert_large_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-13-kcbert_large_finetuned_en.md new file mode 100644 index 00000000000000..6ea04c1b9af09d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-kcbert_large_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kcbert_large_finetuned BertEmbeddings from LoraBaek +author: John Snow Labs +name: kcbert_large_finetuned +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kcbert_large_finetuned` is an English model originally trained by LoraBaek. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kcbert_large_finetuned_en_5.1.1_3.0_1694644868795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kcbert_large_finetuned_en_5.1.1_3.0_1694644868795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kcbert_large_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kcbert_large_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kcbert_large_finetuned| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/LoraBaek/kcbert-large-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-klue_base_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-13-klue_base_finetuned_en.md new file mode 100644 index 00000000000000..b8032d3fd51c7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-klue_base_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English klue_base_finetuned BertEmbeddings from eno3940 +author: John Snow Labs +name: klue_base_finetuned +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue_base_finetuned` is an English model originally trained by eno3940. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_base_finetuned_en_5.1.1_3.0_1694646857491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_base_finetuned_en_5.1.1_3.0_1694646857491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("klue_base_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("klue_base_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_base_finetuned| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/eno3940/klue-base-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-klue_bert_epoch3_en.md b/docs/_posts/ahmedlone127/2023-09-13-klue_bert_epoch3_en.md new file mode 100644 index 00000000000000..bd821e3e444e85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-klue_bert_epoch3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English klue_bert_epoch3 BertEmbeddings from eno3940 +author: John Snow Labs +name: klue_bert_epoch3 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue_bert_epoch3` is an English model originally trained by eno3940. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_bert_epoch3_en_5.1.1_3.0_1694648970288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_bert_epoch3_en_5.1.1_3.0_1694648970288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("klue_bert_epoch3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("klue_bert_epoch3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_bert_epoch3| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/eno3940/klue-bert-epoch3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-ksl_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-ksl_bert_en.md new file mode 100644 index 00000000000000..02249500c2f5a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-ksl_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ksl_bert BertEmbeddings from dobbytk +author: John Snow Labs +name: ksl_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ksl_bert` is an English model originally trained by dobbytk. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ksl_bert_en_5.1.1_3.0_1694626325742.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ksl_bert_en_5.1.1_3.0_1694626325742.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("ksl_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("ksl_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ksl_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.2 MB| + +## References + +https://huggingface.co/dobbytk/KSL-BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-legal_hebert_en.md b/docs/_posts/ahmedlone127/2023-09-13-legal_hebert_en.md new file mode 100644 index 00000000000000..508141215a8c84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-legal_hebert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_hebert BertEmbeddings from avichr +author: John Snow Labs +name: legal_hebert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_hebert` is an English model originally trained by avichr. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_hebert_en_5.1.1_3.0_1694641249619.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_hebert_en_5.1.1_3.0_1694641249619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("legal_hebert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("legal_hebert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: a Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_hebert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|462.5 MB| + +## References + +https://huggingface.co/avichr/Legal-heBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_en.md b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_en.md new file mode 100644 index 00000000000000..40968056fea7e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_indobert_pytorch BertEmbeddings from kapanjagocoding +author: John Snow Labs +name: legal_indobert_pytorch +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_indobert_pytorch` is an English model originally trained by kapanjagocoding. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_en_5.1.1_3.0_1694636800311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_en_5.1.1_3.0_1694636800311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("legal_indobert_pytorch", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("legal_indobert_pytorch", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
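The download link for this card points at an archive named `legal_indobert_pytorch_en_5.1.1_3.0_1694636800311.zip`. Read against the front matter, the name appears to pack the model name, language code, Spark NLP version, Spark version, and a millisecond Unix timestamp. That layout is inferred from the cards in this diff and is an assumption, not a documented naming rule. The sketch below splits such a name into its parts; `parse_archive_name` and `ARCHIVE_RE` are hypothetical helpers, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <name>_<lang>_<spark-nlp version>_<spark version>_<epoch ms>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<spark_nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_archive_name(filename):
    """Split an archive file name into its (assumed) components."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename!r}")
    parts = match.groupdict()
    # Interpret the trailing 13-digit field as milliseconds since the Unix epoch.
    millis = int(parts.pop("ts"))
    parts["uploaded"] = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return parts


info = parse_archive_name("legal_indobert_pytorch_en_5.1.1_3.0_1694636800311.zip")
print(info["name"], info["lang"], info["spark_nlp"], info["spark"])
print(info["uploaded"].date().isoformat())
```

For this archive the decoded date matches the `date: 2023-09-13` field in the card's front matter, which supports the timestamp reading without proving it.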
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_indobert_pytorch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/kapanjagocoding/legal-indobert-pytorch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v1_en.md b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v1_en.md new file mode 100644 index 00000000000000..446b310aff75d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_indobert_pytorch_v1 BertEmbeddings from kapanjagocoding +author: John Snow Labs +name: legal_indobert_pytorch_v1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_indobert_pytorch_v1` is an English model originally trained by kapanjagocoding. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v1_en_5.1.1_3.0_1694637779586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v1_en_5.1.1_3.0_1694637779586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("legal_indobert_pytorch_v1", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("legal_indobert_pytorch_v1", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_indobert_pytorch_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/kapanjagocoding/legal-indobert-pytorch-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v2_en.md b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v2_en.md new file mode 100644 index 00000000000000..dca64da8b73945 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_indobert_pytorch_v2 BertEmbeddings from kapanjagocoding +author: John Snow Labs +name: legal_indobert_pytorch_v2 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_indobert_pytorch_v2` is an English model originally trained by kapanjagocoding. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v2_en_5.1.1_3.0_1694637348680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v2_en_5.1.1_3.0_1694637348680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("legal_indobert_pytorch_v2", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("legal_indobert_pytorch_v2", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_indobert_pytorch_v2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/kapanjagocoding/legal-indobert-pytorch-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v3_en.md b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v3_en.md new file mode 100644 index 00000000000000..05faa53af7c6ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_indobert_pytorch_v3 BertEmbeddings from kapanjagocoding +author: John Snow Labs +name: legal_indobert_pytorch_v3 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_indobert_pytorch_v3` is an English model originally trained by kapanjagocoding. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v3_en_5.1.1_3.0_1694638150002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v3_en_5.1.1_3.0_1694638150002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("legal_indobert_pytorch_v3", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("legal_indobert_pytorch_v3", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_indobert_pytorch_v3| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/kapanjagocoding/legal-indobert-pytorch-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v4_en.md b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v4_en.md new file mode 100644 index 00000000000000..3c192c002de763 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-legal_indobert_pytorch_v4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_indobert_pytorch_v4 BertEmbeddings from kapanjagocoding +author: John Snow Labs +name: legal_indobert_pytorch_v4 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_indobert_pytorch_v4` is an English model originally trained by kapanjagocoding. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v4_en_5.1.1_3.0_1694643325435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_indobert_pytorch_v4_en_5.1.1_3.0_1694643325435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("legal_indobert_pytorch_v4", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("legal_indobert_pytorch_v4", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_indobert_pytorch_v4| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/kapanjagocoding/legal-indobert-pytorch-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-lht_bert_512_en.md b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_512_en.md new file mode 100644 index 00000000000000..8e6a0c237b6dc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English lht_bert_512 BertEmbeddings from Shanny +author: John Snow Labs +name: lht_bert_512 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lht_bert_512` is an English model originally trained by Shanny. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lht_bert_512_en_5.1.1_3.0_1694635559814.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lht_bert_512_en_5.1.1_3.0_1694635559814.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("lht_bert_512", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("lht_bert_512", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lht_bert_512| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Shanny/LHT_BERT_512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-lht_bert_added_tokens_en.md b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_added_tokens_en.md new file mode 100644 index 00000000000000..468f1bba828a96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_added_tokens_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English lht_bert_added_tokens BertEmbeddings from Shanny +author: John Snow Labs +name: lht_bert_added_tokens +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lht_bert_added_tokens` is an English model originally trained by Shanny. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lht_bert_added_tokens_en_5.1.1_3.0_1694630512751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lht_bert_added_tokens_en_5.1.1_3.0_1694630512751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("lht_bert_added_tokens", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("lht_bert_added_tokens", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lht_bert_added_tokens| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Shanny/LHT_BERT_added_tokens \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-lht_bert_added_words_en.md b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_added_words_en.md new file mode 100644 index 00000000000000..d52f808bf7aad1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_added_words_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English lht_bert_added_words BertEmbeddings from Shanny +author: John Snow Labs +name: lht_bert_added_words +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lht_bert_added_words` is an English model originally trained by Shanny. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lht_bert_added_words_en_5.1.1_3.0_1694629369350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lht_bert_added_words_en_5.1.1_3.0_1694629369350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("lht_bert_added_words", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("lht_bert_added_words", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lht_bert_added_words| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Shanny/LHT_BERT_added_words \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-lht_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_en.md new file mode 100644 index 00000000000000..4b569d0297cd15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-lht_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English lht_bert BertEmbeddings from Shanny +author: John Snow Labs +name: lht_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lht_bert` is an English model originally trained by Shanny. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lht_bert_en_5.1.1_3.0_1694621223965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lht_bert_en_5.1.1_3.0_1694621223965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("lht_bert", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("lht_bert", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lht_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Shanny/LHT_BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-linkbert_base_en.md b/docs/_posts/ahmedlone127/2023-09-13-linkbert_base_en.md new file mode 100644 index 00000000000000..23ffaa7b372c74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-linkbert_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English linkbert_base BertEmbeddings from michiyasunaga +author: John Snow Labs +name: linkbert_base +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `linkbert_base` is an English model originally trained by michiyasunaga. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/linkbert_base_en_5.1.1_3.0_1694606637209.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/linkbert_base_en_5.1.1_3.0_1694606637209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("linkbert_base", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("linkbert_base", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|linkbert_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.4 MB| + +## References + +https://huggingface.co/michiyasunaga/LinkBERT-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-linkbert_large_en.md b/docs/_posts/ahmedlone127/2023-09-13-linkbert_large_en.md new file mode 100644 index 00000000000000..8c6be65db73918 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-linkbert_large_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English linkbert_large BertEmbeddings from michiyasunaga +author: John Snow Labs +name: linkbert_large +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `linkbert_large` is an English model originally trained by michiyasunaga. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/linkbert_large_en_5.1.1_3.0_1694605675771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/linkbert_large_en_5.1.1_3.0_1694605675771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("linkbert_large", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("linkbert_large", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|linkbert_large| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/michiyasunaga/LinkBERT-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-luxembert_en.md b/docs/_posts/ahmedlone127/2023-09-13-luxembert_en.md new file mode 100644 index 00000000000000..390bb1ff2d62f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-luxembert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English luxembert BertEmbeddings from lothritz +author: John Snow Labs +name: luxembert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `luxembert` is an English model originally trained by lothritz. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/luxembert_en_5.1.1_3.0_1694643972873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/luxembert_en_5.1.1_3.0_1694643972873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("luxembert", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("luxembert", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|luxembert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.7 MB| + +## References + +https://huggingface.co/lothritz/LuxemBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mabel_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-13-mabel_bert_base_uncased_en.md new file mode 100644 index 00000000000000..08dec8669262a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mabel_bert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mabel_bert_base_uncased BertEmbeddings from princeton-nlp +author: John Snow Labs +name: mabel_bert_base_uncased +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mabel_bert_base_uncased` is a English model originally trained by princeton-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mabel_bert_base_uncased_en_5.1.1_3.0_1694611081845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mabel_bert_base_uncased_en_5.1.1_3.0_1694611081845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("mabel_bert_base_uncased", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("mabel_bert_base_uncased", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mabel_bert_base_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/princeton-nlp/mabel-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-malayalam_bert_ml.md b/docs/_posts/ahmedlone127/2023-09-13-malayalam_bert_ml.md new file mode 100644 index 00000000000000..41ea31fd981233 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-malayalam_bert_ml.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Malayalam malayalam_bert BertEmbeddings from l3cube-pune +author: John Snow Labs +name: malayalam_bert +date: 2023-09-13 +tags: [bert, ml, open_source, fill_mask, onnx] +task: Embeddings +language: ml +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`malayalam_bert` is a Malayalam model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malayalam_bert_ml_5.1.1_3.0_1694640553943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malayalam_bert_ml_5.1.1_3.0_1694640553943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("malayalam_bert", "ml") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("malayalam_bert", "ml") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malayalam_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ml| +|Size:|890.5 MB| + +## References + +https://huggingface.co/l3cube-pune/malayalam-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-manubert_en.md b/docs/_posts/ahmedlone127/2023-09-13-manubert_en.md new file mode 100644 index 00000000000000..d0d6b24ffcc2f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-manubert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English manubert BertEmbeddings from akumar33 +author: John Snow Labs +name: manubert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`manubert` is a English model originally trained by akumar33. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/manubert_en_5.1.1_3.0_1694648045075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/manubert_en_5.1.1_3.0_1694648045075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("manubert", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("manubert", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|manubert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/akumar33/ManuBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-marathi_bert_v2_mr.md b/docs/_posts/ahmedlone127/2023-09-13-marathi_bert_v2_mr.md new file mode 100644 index 00000000000000..f4caa9d6aa76ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-marathi_bert_v2_mr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Marathi marathi_bert_v2 BertEmbeddings from l3cube-pune +author: John Snow Labs +name: marathi_bert_v2 +date: 2023-09-13 +tags: [bert, mr, open_source, fill_mask, onnx] +task: Embeddings +language: mr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`marathi_bert_v2` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_bert_v2_mr_5.1.1_3.0_1694573439496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_bert_v2_mr_5.1.1_3.0_1694573439496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("marathi_bert_v2", "mr") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("marathi_bert_v2", "mr") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_bert_v2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|mr| +|Size:|890.6 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-bert-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-marathi_tweets_bert_scratch_mr.md b/docs/_posts/ahmedlone127/2023-09-13-marathi_tweets_bert_scratch_mr.md new file mode 100644 index 00000000000000..23d5443914b7c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-marathi_tweets_bert_scratch_mr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Marathi marathi_tweets_bert_scratch BertEmbeddings from l3cube-pune +author: John Snow Labs +name: marathi_tweets_bert_scratch +date: 2023-09-13 +tags: [bert, mr, open_source, fill_mask, onnx] +task: Embeddings +language: mr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`marathi_tweets_bert_scratch` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_tweets_bert_scratch_mr_5.1.1_3.0_1694631845235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_tweets_bert_scratch_mr_5.1.1_3.0_1694631845235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("marathi_tweets_bert_scratch", "mr") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("marathi_tweets_bert_scratch", "mr") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_tweets_bert_scratch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|mr| +|Size:|470.4 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-tweets-bert-scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-marbertv2_ar.md b/docs/_posts/ahmedlone127/2023-09-13-marbertv2_ar.md new file mode 100644 index 00000000000000..0ae0523e7cfbf1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-marbertv2_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic marbertv2 BertEmbeddings from UBC-NLP +author: John Snow Labs +name: marbertv2 +date: 2023-09-13 +tags: [bert, ar, open_source, fill_mask, onnx] +task: Embeddings +language: ar +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`marbertv2` is a Arabic model originally trained by UBC-NLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marbertv2_ar_5.1.1_3.0_1694574149293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marbertv2_ar_5.1.1_3.0_1694574149293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("marbertv2", "ar") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("marbertv2", "ar") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marbertv2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ar| +|Size:|606.5 MB| + +## References + +https://huggingface.co/UBC-NLP/MARBERTv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mbert_deen_en.md b/docs/_posts/ahmedlone127/2023-09-13-mbert_deen_en.md new file mode 100644 index 00000000000000..62c910323cef2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mbert_deen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mbert_deen BertEmbeddings from miugod +author: John Snow Labs +name: mbert_deen +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mbert_deen` is a English model originally trained by miugod. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_deen_en_5.1.1_3.0_1694644088821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_deen_en_5.1.1_3.0_1694644088821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("mbert_deen", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("mbert_deen", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_deen| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/miugod/mbert_deen \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mbertu_mt.md b/docs/_posts/ahmedlone127/2023-09-13-mbertu_mt.md new file mode 100644 index 00000000000000..dbdf2ab7dbdd6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mbertu_mt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Maltese mbertu BertEmbeddings from MLRS +author: John Snow Labs +name: mbertu +date: 2023-09-13 +tags: [bert, mt, open_source, fill_mask, onnx] +task: Embeddings +language: mt +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mbertu` is a Maltese model originally trained by MLRS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbertu_mt_5.1.1_3.0_1694635757980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbertu_mt_5.1.1_3.0_1694635757980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("mbertu", "mt") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("mbertu", "mt") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbertu| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|mt| +|Size:|664.5 MB| + +## References + +https://huggingface.co/MLRS/mBERTu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-me_bert_mixed_mr.md b/docs/_posts/ahmedlone127/2023-09-13-me_bert_mixed_mr.md new file mode 100644 index 00000000000000..08a957310142ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-me_bert_mixed_mr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Marathi me_bert_mixed BertEmbeddings from l3cube-pune +author: John Snow Labs +name: me_bert_mixed +date: 2023-09-13 +tags: [bert, mr, open_source, fill_mask, onnx] +task: Embeddings +language: mr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`me_bert_mixed` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/me_bert_mixed_mr_5.1.1_3.0_1694647549621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/me_bert_mixed_mr_5.1.1_3.0_1694647549621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("me_bert_mixed", "mr") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("me_bert_mixed", "mr") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|me_bert_mixed| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|mr| +|Size:|665.0 MB| + +## References + +https://huggingface.co/l3cube-pune/me-bert-mixed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-me_bert_mr.md b/docs/_posts/ahmedlone127/2023-09-13-me_bert_mr.md new file mode 100644 index 00000000000000..c2e61fe5bcacd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-me_bert_mr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Marathi me_bert BertEmbeddings from l3cube-pune +author: John Snow Labs +name: me_bert +date: 2023-09-13 +tags: [bert, mr, open_source, fill_mask, onnx] +task: Embeddings +language: mr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`me_bert` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/me_bert_mr_5.1.1_3.0_1694646862410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/me_bert_mr_5.1.1_3.0_1694646862410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("me_bert", "mr") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("me_bert", "mr") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|me_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|mr| +|Size:|407.2 MB| + +## References + +https://huggingface.co/l3cube-pune/me-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-medical_bio_bert2_en.md b/docs/_posts/ahmedlone127/2023-09-13-medical_bio_bert2_en.md new file mode 100644 index 00000000000000..ef12b2cfe94345 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-medical_bio_bert2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English medical_bio_bert2 BertEmbeddings from fspanda +author: John Snow Labs +name: medical_bio_bert2 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`medical_bio_bert2` is a English model originally trained by fspanda. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medical_bio_bert2_en_5.1.1_3.0_1694647435608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medical_bio_bert2_en_5.1.1_3.0_1694647435608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("medical_bio_bert2", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("medical_bio_bert2", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medical_bio_bert2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/fspanda/Medical-Bio-BERT2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-metaphor_finetuned_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-metaphor_finetuned_bert_en.md new file mode 100644 index 00000000000000..f67dea2a490de9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-metaphor_finetuned_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English metaphor_finetuned_bert BertEmbeddings from kangela +author: John Snow Labs +name: metaphor_finetuned_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`metaphor_finetuned_bert` is a English model originally trained by kangela. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/metaphor_finetuned_bert_en_5.1.1_3.0_1694633662658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/metaphor_finetuned_bert_en_5.1.1_3.0_1694633662658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("metaphor_finetuned_bert", "en") \ + .setInputCols(["documents", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("metaphor_finetuned_bert", "en") + .setInputCols(Array("documents", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|metaphor_finetuned_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/kangela/Metaphor-FineTuned-BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mlm_20230404_002_1_en.md b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230404_002_1_en.md new file mode 100644 index 00000000000000..8c5b0f1c67eb01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230404_002_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlm_20230404_002_1 BertEmbeddings from intanm +author: John Snow Labs +name: mlm_20230404_002_1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mlm_20230404_002_1` is a English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlm_20230404_002_1_en_5.1.1_3.0_1694610217733.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlm_20230404_002_1_en_5.1.1_3.0_1694610217733.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlm_20230404_002_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlm_20230404_002_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlm_20230404_002_1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/intanm/mlm-20230404-002-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_2_en.md b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_2_en.md new file mode 100644 index 00000000000000..3a0525308de1db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlm_20230405_002_2 BertEmbeddings from intanm +author: John Snow Labs +name: mlm_20230405_002_2 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlm_20230405_002_2` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlm_20230405_002_2_en_5.1.1_3.0_1694611265207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlm_20230405_002_2_en_5.1.1_3.0_1694611265207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlm_20230405_002_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlm_20230405_002_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlm_20230405_002_2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/intanm/mlm-20230405-002-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_3_en.md b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_3_en.md new file mode 100644 index 00000000000000..e5a6ceb1a506fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlm_20230405_002_3 BertEmbeddings from intanm +author: John Snow Labs +name: mlm_20230405_002_3 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlm_20230405_002_3` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlm_20230405_002_3_en_5.1.1_3.0_1694612950326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlm_20230405_002_3_en_5.1.1_3.0_1694612950326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlm_20230405_002_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlm_20230405_002_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlm_20230405_002_3| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/intanm/mlm-20230405-002-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_4_en.md b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_4_en.md new file mode 100644 index 00000000000000..67725b1e5c7263 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230405_002_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlm_20230405_002_4 BertEmbeddings from intanm +author: John Snow Labs +name: mlm_20230405_002_4 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlm_20230405_002_4` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlm_20230405_002_4_en_5.1.1_3.0_1694614086684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlm_20230405_002_4_en_5.1.1_3.0_1694614086684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlm_20230405_002_4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlm_20230405_002_4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlm_20230405_002_4| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/intanm/mlm-20230405-002-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mlm_20230406_002_5_en.md b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230406_002_5_en.md new file mode 100644 index 00000000000000..443a54d364e608 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mlm_20230406_002_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlm_20230406_002_5 BertEmbeddings from intanm +author: John Snow Labs +name: mlm_20230406_002_5 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlm_20230406_002_5` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlm_20230406_002_5_en_5.1.1_3.0_1694614626559.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlm_20230406_002_5_en_5.1.1_3.0_1694614626559.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlm_20230406_002_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlm_20230406_002_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlm_20230406_002_5| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.3 MB| + +## References + +https://huggingface.co/intanm/mlm-20230406-002-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-model1_en.md b/docs/_posts/ahmedlone127/2023-09-13-model1_en.md new file mode 100644 index 00000000000000..b4e45db0bc3ce5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-model1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model1 BertEmbeddings from flymushroom +author: John Snow Labs +name: model1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model1` is an English model originally trained by flymushroom. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model1_en_5.1.1_3.0_1694646795523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model1_en_5.1.1_3.0_1694646795523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("model1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("model1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/flymushroom/model1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mybert_base_32k_en.md b/docs/_posts/ahmedlone127/2023-09-13-mybert_base_32k_en.md new file mode 100644 index 00000000000000..1a251c986969af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mybert_base_32k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mybert_base_32k BertEmbeddings from maveriq +author: John Snow Labs +name: mybert_base_32k +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mybert_base_32k` is an English model originally trained by maveriq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mybert_base_32k_en_5.1.1_3.0_1694629972484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mybert_base_32k_en_5.1.1_3.0_1694629972484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mybert_base_32k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mybert_base_32k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mybert_base_32k| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.4 MB| + +## References + +https://huggingface.co/maveriq/mybert-base-32k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_172k_en.md b/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_172k_en.md new file mode 100644 index 00000000000000..ca9d03fc99c778 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_172k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mybert_mini_172k BertEmbeddings from maveriq +author: John Snow Labs +name: mybert_mini_172k +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mybert_mini_172k` is an English model originally trained by maveriq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mybert_mini_172k_en_5.1.1_3.0_1694630527630.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mybert_mini_172k_en_5.1.1_3.0_1694630527630.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mybert_mini_172k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mybert_mini_172k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mybert_mini_172k| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|42.0 MB| + +## References + +https://huggingface.co/maveriq/mybert-mini-172k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_1m_en.md b/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_1m_en.md new file mode 100644 index 00000000000000..b9603b419e6d4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_1m_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mybert_mini_1m BertEmbeddings from maveriq +author: John Snow Labs +name: mybert_mini_1m +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mybert_mini_1m` is an English model originally trained by maveriq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mybert_mini_1m_en_5.1.1_3.0_1694640791254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mybert_mini_1m_en_5.1.1_3.0_1694640791254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mybert_mini_1m","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mybert_mini_1m", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mybert_mini_1m| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|42.0 MB| + +## References + +https://huggingface.co/maveriq/mybert-mini-1M \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_500k_en.md b/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_500k_en.md new file mode 100644 index 00000000000000..2f196e297605db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mybert_mini_500k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mybert_mini_500k BertEmbeddings from maveriq +author: John Snow Labs +name: mybert_mini_500k +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mybert_mini_500k` is an English model originally trained by maveriq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mybert_mini_500k_en_5.1.1_3.0_1694640484187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mybert_mini_500k_en_5.1.1_3.0_1694640484187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mybert_mini_500k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mybert_mini_500k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mybert_mini_500k| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/maveriq/mybert-mini-500k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-mymode03_en.md b/docs/_posts/ahmedlone127/2023-09-13-mymode03_en.md new file mode 100644 index 00000000000000..79221a5937bc9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-mymode03_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mymode03 BertEmbeddings from wbmitcast +author: John Snow Labs +name: mymode03 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mymode03` is an English model originally trained by wbmitcast. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mymode03_en_5.1.1_3.0_1694583211320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mymode03_en_5.1.1_3.0_1694583211320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mymode03","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mymode03", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mymode03| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/wbmitcast/mymode03 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-nbme_bio_clinicalbert_en.md b/docs/_posts/ahmedlone127/2023-09-13-nbme_bio_clinicalbert_en.md new file mode 100644 index 00000000000000..a8752a06674e57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-nbme_bio_clinicalbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nbme_bio_clinicalbert BertEmbeddings from smeoni +author: John Snow Labs +name: nbme_bio_clinicalbert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nbme_bio_clinicalbert` is an English model originally trained by smeoni. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nbme_bio_clinicalbert_en_5.1.1_3.0_1694632921114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nbme_bio_clinicalbert_en_5.1.1_3.0_1694632921114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("nbme_bio_clinicalbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("nbme_bio_clinicalbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nbme_bio_clinicalbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.2 MB| + +## References + +https://huggingface.co/smeoni/nbme-Bio_ClinicalBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-news_pretrain_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-news_pretrain_bert_en.md new file mode 100644 index 00000000000000..811a7e392add61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-news_pretrain_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English news_pretrain_bert BertEmbeddings from AnonymousSub +author: John Snow Labs +name: news_pretrain_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_pretrain_bert` is an English model originally trained by AnonymousSub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_pretrain_bert_en_5.1.1_3.0_1694625096290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_pretrain_bert_en_5.1.1_3.0_1694625096290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("news_pretrain_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("news_pretrain_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|news_pretrain_bert|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/AnonymousSub/news-pretrain-bert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-odia_bert_or.md b/docs/_posts/ahmedlone127/2023-09-13-odia_bert_or.md
new file mode 100644
index 00000000000000..d68febbe294480
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-odia_bert_or.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Oriya (macrolanguage) odia_bert BertEmbeddings from l3cube-pune
+author: John Snow Labs
+name: odia_bert
+date: 2023-09-13
+tags: [bert, or, open_source, fill_mask, onnx]
+task: Embeddings
+language: or
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `odia_bert` is an Oriya (macrolanguage) model originally trained by l3cube-pune.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/odia_bert_or_5.1.1_3.0_1694643840678.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/odia_bert_or_5.1.1_3.0_1694643840678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("odia_bert","or") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("odia_bert", "or")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|odia_bert|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|or|
+|Size:|890.4 MB|
+
+## References
+
+https://huggingface.co/l3cube-pune/odia-bert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-olm_bert_base_uncased_oct_2022_en.md b/docs/_posts/ahmedlone127/2023-09-13-olm_bert_base_uncased_oct_2022_en.md
new file mode 100644
index 00000000000000..b56ea2b2160904
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-olm_bert_base_uncased_oct_2022_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English olm_bert_base_uncased_oct_2022 BertEmbeddings from Tristan
+author: John Snow Labs
+name: olm_bert_base_uncased_oct_2022
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `olm_bert_base_uncased_oct_2022` is an English model originally trained by Tristan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/olm_bert_base_uncased_oct_2022_en_5.1.1_3.0_1694626342322.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/olm_bert_base_uncased_oct_2022_en_5.1.1_3.0_1694626342322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("olm_bert_base_uncased_oct_2022","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("olm_bert_base_uncased_oct_2022", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|olm_bert_base_uncased_oct_2022|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|464.6 MB|
+
+## References
+
+https://huggingface.co/Tristan/olm-bert-base-uncased-oct-2022
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-parlbert_german_v1_de.md b/docs/_posts/ahmedlone127/2023-09-13-parlbert_german_v1_de.md
new file mode 100644
index 00000000000000..6ae3abde88faa7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-parlbert_german_v1_de.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: German parlbert_german_v1 BertEmbeddings from chkla
+author: John Snow Labs
+name: parlbert_german_v1
+date: 2023-09-13
+tags: [bert, de, open_source, fill_mask, onnx]
+task: Embeddings
+language: de
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parlbert_german_v1` is a German model originally trained by chkla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parlbert_german_v1_de_5.1.1_3.0_1694648419060.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parlbert_german_v1_de_5.1.1_3.0_1694648419060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("parlbert_german_v1","de") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("parlbert_german_v1", "de")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|parlbert_german_v1|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|de|
+|Size:|406.8 MB|
+
+## References
+
+https://huggingface.co/chkla/parlbert-german-v1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-pashto_bert_c_en.md b/docs/_posts/ahmedlone127/2023-09-13-pashto_bert_c_en.md
new file mode 100644
index 00000000000000..4370e423b498b6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-pashto_bert_c_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English pashto_bert_c BertEmbeddings from ijazulhaq
+author: John Snow Labs
+name: pashto_bert_c
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pashto_bert_c` is an English model originally trained by ijazulhaq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pashto_bert_c_en_5.1.1_3.0_1694629099684.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pashto_bert_c_en_5.1.1_3.0_1694629099684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("pashto_bert_c","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("pashto_bert_c", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|pashto_bert_c|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.3 MB|
+
+## References
+
+https://huggingface.co/ijazulhaq/pashto-bert-c
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-patana_chilean_spanish_bert_es.md b/docs/_posts/ahmedlone127/2023-09-13-patana_chilean_spanish_bert_es.md
new file mode 100644
index 00000000000000..d244e1f29106cc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-patana_chilean_spanish_bert_es.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Castilian, Spanish patana_chilean_spanish_bert BertEmbeddings from dccuchile
+author: John Snow Labs
+name: patana_chilean_spanish_bert
+date: 2023-09-13
+tags: [bert, es, open_source, fill_mask, onnx]
+task: Embeddings
+language: es
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `patana_chilean_spanish_bert` is a Castilian, Spanish model originally trained by dccuchile.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/patana_chilean_spanish_bert_es_5.1.1_3.0_1694619206082.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/patana_chilean_spanish_bert_es_5.1.1_3.0_1694619206082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("patana_chilean_spanish_bert","es") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("patana_chilean_spanish_bert", "es")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|patana_chilean_spanish_bert|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|es|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/dccuchile/patana-chilean-spanish-bert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-phrase_bert_finetuned_imdb_en.md b/docs/_posts/ahmedlone127/2023-09-13-phrase_bert_finetuned_imdb_en.md
new file mode 100644
index 00000000000000..853995c9977ccb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-phrase_bert_finetuned_imdb_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English phrase_bert_finetuned_imdb BertEmbeddings from Sarmila
+author: John Snow Labs
+name: phrase_bert_finetuned_imdb
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrase_bert_finetuned_imdb` is an English model originally trained by Sarmila.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrase_bert_finetuned_imdb_en_5.1.1_3.0_1694641064662.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrase_bert_finetuned_imdb_en_5.1.1_3.0_1694641064662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("phrase_bert_finetuned_imdb","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("phrase_bert_finetuned_imdb", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|phrase_bert_finetuned_imdb|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/Sarmila/phrase-bert-finetuned-imdb
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-phs_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-phs_bert_en.md
new file mode 100644
index 00000000000000..55e139407e13dc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-phs_bert_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English phs_bert BertEmbeddings from publichealthsurveillance
+author: John Snow Labs
+name: phs_bert
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phs_bert` is an English model originally trained by publichealthsurveillance.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phs_bert_en_5.1.1_3.0_1694631316270.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phs_bert_en_5.1.1_3.0_1694631316270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("phs_bert","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("phs_bert", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|phs_bert|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/publichealthsurveillance/PHS-BERT
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-pretrained_kyw_e1_en.md b/docs/_posts/ahmedlone127/2023-09-13-pretrained_kyw_e1_en.md
new file mode 100644
index 00000000000000..dd4ad5670e1bee
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-pretrained_kyw_e1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English pretrained_kyw_e1 BertEmbeddings from shahriargolchin
+author: John Snow Labs
+name: pretrained_kyw_e1
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pretrained_kyw_e1` is an English model originally trained by shahriargolchin.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pretrained_kyw_e1_en_5.1.1_3.0_1694646138393.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pretrained_kyw_e1_en_5.1.1_3.0_1694646138393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("pretrained_kyw_e1","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("pretrained_kyw_e1", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|pretrained_kyw_e1|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/shahriargolchin/pretrained_kyw_e1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-prompt_ls_english_2_en.md b/docs/_posts/ahmedlone127/2023-09-13-prompt_ls_english_2_en.md
new file mode 100644
index 00000000000000..d807d4577d1641
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-prompt_ls_english_2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English prompt_ls_english_2 BertEmbeddings from lmvasque
+author: John Snow Labs
+name: prompt_ls_english_2
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `prompt_ls_english_2` is an English model originally trained by lmvasque.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/prompt_ls_english_2_en_5.1.1_3.0_1694625527257.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/prompt_ls_english_2_en_5.1.1_3.0_1694625527257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("prompt_ls_english_2","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("prompt_ls_english_2", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|prompt_ls_english_2|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|625.5 MB|
+
+## References
+
+https://huggingface.co/lmvasque/prompt-ls-en-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-prunedbert_l12_h256_a4_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-13-prunedbert_l12_h256_a4_finetuned_en.md
new file mode 100644
index 00000000000000..7841fe1b778e1c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-prunedbert_l12_h256_a4_finetuned_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English prunedbert_l12_h256_a4_finetuned BertEmbeddings from eli4s
+author: John Snow Labs
+name: prunedbert_l12_h256_a4_finetuned
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `prunedbert_l12_h256_a4_finetuned` is an English model originally trained by eli4s.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/prunedbert_l12_h256_a4_finetuned_en_5.1.1_3.0_1694630185594.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/prunedbert_l12_h256_a4_finetuned_en_5.1.1_3.0_1694630185594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("prunedbert_l12_h256_a4_finetuned","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("prunedbert_l12_h256_a4_finetuned", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|prunedbert_l12_h256_a4_finetuned|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|111.9 MB|
+
+## References
+
+https://huggingface.co/eli4s/prunedBert-L12-h256-A4-finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-prunedbert_l12_h384_a6_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-13-prunedbert_l12_h384_a6_finetuned_en.md
new file mode 100644
index 00000000000000..3c9423988f0cf1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-prunedbert_l12_h384_a6_finetuned_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English prunedbert_l12_h384_a6_finetuned BertEmbeddings from eli4s
+author: John Snow Labs
+name: prunedbert_l12_h384_a6_finetuned
+date: 2023-09-13
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `prunedbert_l12_h384_a6_finetuned` is an English model originally trained by eli4s.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/prunedbert_l12_h384_a6_finetuned_en_5.1.1_3.0_1694630539022.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/prunedbert_l12_h384_a6_finetuned_en_5.1.1_3.0_1694630539022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("prunedbert_l12_h384_a6_finetuned","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("prunedbert_l12_h384_a6_finetuned", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|prunedbert_l12_h384_a6_finetuned|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|176.4 MB|
+
+## References
+
+https://huggingface.co/eli4s/prunedBert-L12-h384-A6-finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-13-punjabi_bert_pa.md b/docs/_posts/ahmedlone127/2023-09-13-punjabi_bert_pa.md
new file mode 100644
index 00000000000000..95cba25b441a16
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-13-punjabi_bert_pa.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Panjabi, Punjabi punjabi_bert BertEmbeddings from l3cube-pune
+author: John Snow Labs
+name: punjabi_bert
+date: 2023-09-13
+tags: [bert, pa, open_source, fill_mask, onnx]
+task: Embeddings
+language: pa
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `punjabi_bert` is a Panjabi, Punjabi model originally trained by l3cube-pune.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/punjabi_bert_pa_5.1.1_3.0_1694645329932.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/punjabi_bert_pa_5.1.1_3.0_1694645329932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("punjabi_bert","pa") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a Tokenizer stage is required.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("punjabi_bert", "pa")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|punjabi_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|pa| +|Size:|890.5 MB| + +## References + +https://huggingface.co/l3cube-pune/punjabi-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-pz_bert_kanuri_en.md b/docs/_posts/ahmedlone127/2023-09-13-pz_bert_kanuri_en.md new file mode 100644 index 00000000000000..0f41891579f9cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-pz_bert_kanuri_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pz_bert_kanuri BertEmbeddings from Hanwoon +author: John Snow Labs +name: pz_bert_kanuri +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pz_bert_kanuri` is an English model originally trained by Hanwoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pz_bert_kanuri_en_5.1.1_3.0_1694626706084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pz_bert_kanuri_en_5.1.1_3.0_1694626706084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("pz_bert_kanuri","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("pz_bert_kanuri", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pz_bert_kanuri| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|441.1 MB| + +## References + +https://huggingface.co/Hanwoon/pz-bert-kr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text2_en.md b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text2_en.md new file mode 100644 index 00000000000000..9284efa835afb5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English reddit_bert_text2 BertEmbeddings from flboehm +author: John Snow Labs +name: reddit_bert_text2 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_bert_text2` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_bert_text2_en_5.1.1_3.0_1694643257404.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_bert_text2_en_5.1.1_3.0_1694643257404.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("reddit_bert_text2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("reddit_bert_text2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_bert_text2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|414.4 MB| + +## References + +https://huggingface.co/flboehm/reddit-bert-text2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text3_en.md b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text3_en.md new file mode 100644 index 00000000000000..69533f6cfa486e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English reddit_bert_text3 BertEmbeddings from flboehm +author: John Snow Labs +name: reddit_bert_text3 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_bert_text3` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_bert_text3_en_5.1.1_3.0_1694643633284.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_bert_text3_en_5.1.1_3.0_1694643633284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("reddit_bert_text3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("reddit_bert_text3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_bert_text3| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|415.0 MB| + +## References + +https://huggingface.co/flboehm/reddit-bert-text3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text4_en.md b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text4_en.md new file mode 100644 index 00000000000000..cef44591552f48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English reddit_bert_text4 BertEmbeddings from flboehm +author: John Snow Labs +name: reddit_bert_text4 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_bert_text4` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_bert_text4_en_5.1.1_3.0_1694644221338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_bert_text4_en_5.1.1_3.0_1694644221338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("reddit_bert_text4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("reddit_bert_text4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_bert_text4| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/flboehm/reddit-bert-text4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_10_en.md b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_10_en.md new file mode 100644 index 00000000000000..7b568bda23693f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English reddit_bert_text_10 BertEmbeddings from flboehm +author: John Snow Labs +name: reddit_bert_text_10 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_bert_text_10` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_bert_text_10_en_5.1.1_3.0_1694644674014.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_bert_text_10_en_5.1.1_3.0_1694644674014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("reddit_bert_text_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("reddit_bert_text_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_bert_text_10| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.8 MB| + +## References + +https://huggingface.co/flboehm/reddit-bert-text_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_20_en.md b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_20_en.md new file mode 100644 index 00000000000000..4fb483f66bca0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English reddit_bert_text_20 BertEmbeddings from flboehm +author: John Snow Labs +name: reddit_bert_text_20 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_bert_text_20` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_bert_text_20_en_5.1.1_3.0_1694645187415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_bert_text_20_en_5.1.1_3.0_1694645187415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("reddit_bert_text_20","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("reddit_bert_text_20", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_bert_text_20| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/flboehm/reddit-bert-text_20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_5_en.md b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_5_en.md new file mode 100644 index 00000000000000..016c2e00b11dd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-reddit_bert_text_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English reddit_bert_text_5 BertEmbeddings from flboehm +author: John Snow Labs +name: reddit_bert_text_5 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_bert_text_5` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_bert_text_5_en_5.1.1_3.0_1694645644655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_bert_text_5_en_5.1.1_3.0_1694645644655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("reddit_bert_text_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("reddit_bert_text_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_bert_text_5| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.7 MB| + +## References + +https://huggingface.co/flboehm/reddit-bert-text_5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-roberta_base_culinary_en.md b/docs/_posts/ahmedlone127/2023-09-13-roberta_base_culinary_en.md new file mode 100644 index 00000000000000..102a06c15282b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-roberta_base_culinary_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English roberta_base_culinary BertEmbeddings from juancavallotti +author: John Snow Labs +name: roberta_base_culinary +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_culinary` is an English model originally trained by juancavallotti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_culinary_en_5.1.1_3.0_1694642566778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_culinary_en_5.1.1_3.0_1694642566778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("roberta_base_culinary","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("roberta_base_culinary", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_culinary| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/juancavallotti/roberta-base-culinary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-rubert_base_vet_en.md b/docs/_posts/ahmedlone127/2023-09-13-rubert_base_vet_en.md new file mode 100644 index 00000000000000..a118d7dacbd2a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-rubert_base_vet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_base_vet BertEmbeddings from erasedwalt +author: John Snow Labs +name: rubert_base_vet +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_vet` is an English model originally trained by erasedwalt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_vet_en_5.1.1_3.0_1694632381694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_vet_en_5.1.1_3.0_1694632381694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("rubert_base_vet","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("rubert_base_vet", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_vet| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|663.8 MB| + +## References + +https://huggingface.co/erasedwalt/rubert-base-vet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-scinewsbert_en.md b/docs/_posts/ahmedlone127/2023-09-13-scinewsbert_en.md new file mode 100644 index 00000000000000..9d56fd0209afb9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-scinewsbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scinewsbert BertEmbeddings from psmeros +author: John Snow Labs +name: scinewsbert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scinewsbert` is an English model originally trained by psmeros. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scinewsbert_en_5.1.1_3.0_1694596298829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scinewsbert_en_5.1.1_3.0_1694596298829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("scinewsbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("scinewsbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scinewsbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/psmeros/SciNewsBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-shangpin_pre_training_en.md b/docs/_posts/ahmedlone127/2023-09-13-shangpin_pre_training_en.md new file mode 100644 index 00000000000000..26ed799f5dfb2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-shangpin_pre_training_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English shangpin_pre_training BertEmbeddings from nnn +author: John Snow Labs +name: shangpin_pre_training +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `shangpin_pre_training` is an English model originally trained by nnn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/shangpin_pre_training_en_5.1.1_3.0_1694643216154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/shangpin_pre_training_en_5.1.1_3.0_1694643216154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("shangpin_pre_training","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("shangpin_pre_training", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|shangpin_pre_training| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.6 MB| + +## References + +https://huggingface.co/nnn/shangpin-pre-training \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-small_mlm_glue_wnli_custom_tokenizer_expand_vocab_en.md b/docs/_posts/ahmedlone127/2023-09-13-small_mlm_glue_wnli_custom_tokenizer_expand_vocab_en.md new file mode 100644 index 00000000000000..d92b404f4f39b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-small_mlm_glue_wnli_custom_tokenizer_expand_vocab_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English small_mlm_glue_wnli_custom_tokenizer_expand_vocab BertEmbeddings from muhtasham +author: John Snow Labs +name: small_mlm_glue_wnli_custom_tokenizer_expand_vocab +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `small_mlm_glue_wnli_custom_tokenizer_expand_vocab` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/small_mlm_glue_wnli_custom_tokenizer_expand_vocab_en_5.1.1_3.0_1694564320620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/small_mlm_glue_wnli_custom_tokenizer_expand_vocab_en_5.1.1_3.0_1694564320620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("small_mlm_glue_wnli_custom_tokenizer_expand_vocab","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("small_mlm_glue_wnli_custom_tokenizer_expand_vocab", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|small_mlm_glue_wnli_custom_tokenizer_expand_vocab| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|108.1 MB| + +## References + +https://huggingface.co/muhtasham/small-mlm-glue-wnli-custom-tokenizer-expand-vocab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-spanish_mlm_03_en.md b/docs/_posts/ahmedlone127/2023-09-13-spanish_mlm_03_en.md new file mode 100644 index 00000000000000..0ec8dd7175ef54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-spanish_mlm_03_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English spanish_mlm_03 BertEmbeddings from ashwathjadhav23 +author: John Snow Labs +name: spanish_mlm_03 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanish_mlm_03` is an English model originally trained by ashwathjadhav23. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spanish_mlm_03_en_5.1.1_3.0_1694612397384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spanish_mlm_03_en_5.1.1_3.0_1694612397384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("spanish_mlm_03","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("spanish_mlm_03", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spanish_mlm_03| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.5 MB| + +## References + +https://huggingface.co/ashwathjadhav23/Spanish_MLM_03 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-splade_all_bertnsp_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-splade_all_bertnsp_220_en.md new file mode 100644 index 00000000000000..cb0f6a81171f89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-splade_all_bertnsp_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_all_bertnsp_220 BertEmbeddings from approach0 +author: John Snow Labs +name: splade_all_bertnsp_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_all_bertnsp_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_all_bertnsp_220_en_5.1.1_3.0_1694634798952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_all_bertnsp_220_en_5.1.1_3.0_1694634798952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_all_bertnsp_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_all_bertnsp_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_all_bertnsp_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/splade_all-bertnsp-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-splade_all_cocomae_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-splade_all_cocomae_220_en.md new file mode 100644 index 00000000000000..9381130510091c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-splade_all_cocomae_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_all_cocomae_220 BertEmbeddings from approach0 +author: John Snow Labs +name: splade_all_cocomae_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_all_cocomae_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_all_cocomae_220_en_5.1.1_3.0_1694633376860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_all_cocomae_220_en_5.1.1_3.0_1694633376860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_all_cocomae_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_all_cocomae_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_all_cocomae_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/splade_all-cocomae-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-splade_nomath_bertnsp_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-splade_nomath_bertnsp_220_en.md new file mode 100644 index 00000000000000..2bdc74d8db74e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-splade_nomath_bertnsp_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_nomath_bertnsp_220 BertEmbeddings from approach0 +author: John Snow Labs +name: splade_nomath_bertnsp_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_nomath_bertnsp_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_nomath_bertnsp_220_en_5.1.1_3.0_1694635158811.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_nomath_bertnsp_220_en_5.1.1_3.0_1694635158811.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_nomath_bertnsp_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_nomath_bertnsp_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_nomath_bertnsp_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/splade_nomath-bertnsp-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-splade_nomath_cocomae_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-splade_nomath_cocomae_220_en.md new file mode 100644 index 00000000000000..d90f7a4b5f6236 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-splade_nomath_cocomae_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_nomath_cocomae_220 BertEmbeddings from approach0 +author: John Snow Labs +name: splade_nomath_cocomae_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_nomath_cocomae_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_nomath_cocomae_220_en_5.1.1_3.0_1694634002866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_nomath_cocomae_220_en_5.1.1_3.0_1694634002866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_nomath_cocomae_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_nomath_cocomae_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_nomath_cocomae_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/splade_nomath-cocomae-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-splade_somemath_bertnsp_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-splade_somemath_bertnsp_220_en.md new file mode 100644 index 00000000000000..263c43c9c0f988 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-splade_somemath_bertnsp_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_somemath_bertnsp_220 BertEmbeddings from approach0 +author: John Snow Labs +name: splade_somemath_bertnsp_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_somemath_bertnsp_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_somemath_bertnsp_220_en_5.1.1_3.0_1694635770562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_somemath_bertnsp_220_en_5.1.1_3.0_1694635770562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_somemath_bertnsp_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_somemath_bertnsp_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_somemath_bertnsp_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/approach0/splade_somemath-bertnsp-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-splade_somemath_cocomae_220_en.md b/docs/_posts/ahmedlone127/2023-09-13-splade_somemath_cocomae_220_en.md new file mode 100644 index 00000000000000..3d1debffa80e64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-splade_somemath_cocomae_220_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_somemath_cocomae_220 BertEmbeddings from approach0 +author: John Snow Labs +name: splade_somemath_cocomae_220 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_somemath_cocomae_220` is an English model originally trained by approach0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_somemath_cocomae_220_en_5.1.1_3.0_1694634379900.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_somemath_cocomae_220_en_5.1.1_3.0_1694634379900.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_somemath_cocomae_220","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_somemath_cocomae_220", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_somemath_cocomae_220| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/approach0/splade_somemath-cocomae-220 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-tamil_bert_ta.md b/docs/_posts/ahmedlone127/2023-09-13-tamil_bert_ta.md new file mode 100644 index 00000000000000..8cb5121ac903b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-tamil_bert_ta.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Tamil tamil_bert BertEmbeddings from l3cube-pune +author: John Snow Labs +name: tamil_bert +date: 2023-09-13 +tags: [bert, ta, open_source, fill_mask, onnx] +task: Embeddings +language: ta +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tamil_bert` is a Tamil model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tamil_bert_ta_5.1.1_3.0_1694641183579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tamil_bert_ta_5.1.1_3.0_1694641183579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("tamil_bert","ta") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("tamil_bert", "ta") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tamil_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ta| +|Size:|890.7 MB| + +## References + +https://huggingface.co/l3cube-pune/tamil-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-telugu_bert_te.md b/docs/_posts/ahmedlone127/2023-09-13-telugu_bert_te.md new file mode 100644 index 00000000000000..c79075d944c5d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-telugu_bert_te.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Telugu telugu_bert BertEmbeddings from l3cube-pune +author: John Snow Labs +name: telugu_bert +date: 2023-09-13 +tags: [bert, te, open_source, fill_mask, onnx] +task: Embeddings +language: te +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `telugu_bert` is a Telugu model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/telugu_bert_te_5.1.1_3.0_1694639562892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/telugu_bert_te_5.1.1_3.0_1694639562892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("telugu_bert","te") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("telugu_bert", "te") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|telugu_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|te| +|Size:|890.5 MB| + +## References + +https://huggingface.co/l3cube-pune/telugu-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-test_model_allispaul_en.md b/docs/_posts/ahmedlone127/2023-09-13-test_model_allispaul_en.md new file mode 100644 index 00000000000000..f76ff5249edd24 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-test_model_allispaul_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_model_allispaul BertEmbeddings from allispaul +author: John Snow Labs +name: test_model_allispaul +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_model_allispaul` is an English model originally trained by allispaul. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_model_allispaul_en_5.1.1_3.0_1694625763081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_model_allispaul_en_5.1.1_3.0_1694625763081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("test_model_allispaul","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("test_model_allispaul", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_model_allispaul| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/allispaul/test-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-testc8_1_en.md b/docs/_posts/ahmedlone127/2023-09-13-testc8_1_en.md new file mode 100644 index 00000000000000..8f4b77e91055a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-testc8_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English testc8_1 BertEmbeddings from shafin +author: John Snow Labs +name: testc8_1 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testc8_1` is an English model originally trained by shafin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testc8_1_en_5.1.1_3.0_1694648911715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testc8_1_en_5.1.1_3.0_1694648911715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("testc8_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("testc8_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testc8_1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/shafin/testc8-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-testc8_2_en.md b/docs/_posts/ahmedlone127/2023-09-13-testc8_2_en.md new file mode 100644 index 00000000000000..b3e819239f89cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-testc8_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English testc8_2 BertEmbeddings from shafin +author: John Snow Labs +name: testc8_2 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testc8_2` is an English model originally trained by shafin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testc8_2_en_5.1.1_3.0_1694649179749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testc8_2_en_5.1.1_3.0_1694649179749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("testc8_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("testc8_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testc8_2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/shafin/testc8-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-tibert_base_ti.md b/docs/_posts/ahmedlone127/2023-09-13-tibert_base_ti.md new file mode 100644 index 00000000000000..a3dc47f3afca22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-tibert_base_ti.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Tigrinya tibert_base BertEmbeddings from fgaim +author: John Snow Labs +name: tibert_base +date: 2023-09-13 +tags: [bert, ti, open_source, fill_mask, onnx] +task: Embeddings +language: ti +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tibert_base` is a Tigrinya model originally trained by fgaim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tibert_base_ti_5.1.1_3.0_1694637403635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tibert_base_ti_5.1.1_3.0_1694637403635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tibert_base","ti") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tibert_base", "ti") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tibert_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ti| +|Size:|407.8 MB| + +## References + +https://huggingface.co/fgaim/tibert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-tinybert_general_4l_312d_german_de.md b/docs/_posts/ahmedlone127/2023-09-13-tinybert_general_4l_312d_german_de.md new file mode 100644 index 00000000000000..2deadf696e9db2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-tinybert_general_4l_312d_german_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German tinybert_general_4l_312d_german BertEmbeddings from dvm1983 +author: John Snow Labs +name: tinybert_general_4l_312d_german +date: 2023-09-13 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert_general_4l_312d_german` is a German model originally trained by dvm1983. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tinybert_general_4l_312d_german_de_5.1.1_3.0_1694629239176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tinybert_general_4l_312d_german_de_5.1.1_3.0_1694629239176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tinybert_general_4l_312d_german","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tinybert_general_4l_312d_german", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tinybert_general_4l_312d_german| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|54.5 MB| + +## References + +https://huggingface.co/dvm1983/TinyBERT_General_4L_312D_de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-tinybert_general_6l_768d_finetuned_wikitext103_en.md b/docs/_posts/ahmedlone127/2023-09-13-tinybert_general_6l_768d_finetuned_wikitext103_en.md new file mode 100644 index 00000000000000..8fe297ac5f96af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-tinybert_general_6l_768d_finetuned_wikitext103_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tinybert_general_6l_768d_finetuned_wikitext103 BertEmbeddings from saghar +author: John Snow Labs +name: tinybert_general_6l_768d_finetuned_wikitext103 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert_general_6l_768d_finetuned_wikitext103` is an English model originally trained by saghar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tinybert_general_6l_768d_finetuned_wikitext103_en_5.1.1_3.0_1694614398325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tinybert_general_6l_768d_finetuned_wikitext103_en_5.1.1_3.0_1694614398325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tinybert_general_6l_768d_finetuned_wikitext103","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tinybert_general_6l_768d_finetuned_wikitext103", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tinybert_general_6l_768d_finetuned_wikitext103| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|248.9 MB| + +## References + +https://huggingface.co/saghar/TinyBERT_General_6L_768D-finetuned-wikitext103 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-tinybert_l_4_h_312_v2_finetuned_wikitext103_en.md b/docs/_posts/ahmedlone127/2023-09-13-tinybert_l_4_h_312_v2_finetuned_wikitext103_en.md new file mode 100644 index 00000000000000..617821d05bb8d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-tinybert_l_4_h_312_v2_finetuned_wikitext103_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tinybert_l_4_h_312_v2_finetuned_wikitext103 BertEmbeddings from saghar +author: John Snow Labs +name: tinybert_l_4_h_312_v2_finetuned_wikitext103 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert_l_4_h_312_v2_finetuned_wikitext103` is an English model originally trained by saghar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tinybert_l_4_h_312_v2_finetuned_wikitext103_en_5.1.1_3.0_1694615065926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tinybert_l_4_h_312_v2_finetuned_wikitext103_en_5.1.1_3.0_1694615065926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tinybert_l_4_h_312_v2_finetuned_wikitext103","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tinybert_l_4_h_312_v2_finetuned_wikitext103", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tinybert_l_4_h_312_v2_finetuned_wikitext103| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|53.8 MB| + +## References + +https://huggingface.co/saghar/TinyBERT_L-4_H-312_v2-finetuned-wikitext103 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-tlm_amazon_large_scale_en.md b/docs/_posts/ahmedlone127/2023-09-13-tlm_amazon_large_scale_en.md new file mode 100644 index 00000000000000..b9c91319cbb7c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-tlm_amazon_large_scale_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tlm_amazon_large_scale BertEmbeddings from yxchar +author: John Snow Labs +name: tlm_amazon_large_scale +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tlm_amazon_large_scale` is an English model originally trained by yxchar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tlm_amazon_large_scale_en_5.1.1_3.0_1694588840311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tlm_amazon_large_scale_en_5.1.1_3.0_1694588840311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tlm_amazon_large_scale","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tlm_amazon_large_scale", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tlm_amazon_large_scale| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yxchar/tlm-amazon-large-scale \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-tlm_hyp_medium_scale_en.md b/docs/_posts/ahmedlone127/2023-09-13-tlm_hyp_medium_scale_en.md new file mode 100644 index 00000000000000..7cfb252c763d5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-tlm_hyp_medium_scale_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tlm_hyp_medium_scale BertEmbeddings from yxchar +author: John Snow Labs +name: tlm_hyp_medium_scale +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tlm_hyp_medium_scale` is an English model originally trained by yxchar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tlm_hyp_medium_scale_en_5.1.1_3.0_1694591052250.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tlm_hyp_medium_scale_en_5.1.1_3.0_1694591052250.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tlm_hyp_medium_scale","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tlm_hyp_medium_scale", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tlm_hyp_medium_scale| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.4 MB| + +## References + +https://huggingface.co/yxchar/tlm-hyp-medium-scale \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-turkish_base_bert_uncased_tr.md b/docs/_posts/ahmedlone127/2023-09-13-turkish_base_bert_uncased_tr.md new file mode 100644 index 00000000000000..a61127112b0517 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-turkish_base_bert_uncased_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish turkish_base_bert_uncased BertEmbeddings from ytu-ce-cosmos +author: John Snow Labs +name: turkish_base_bert_uncased +date: 2023-09-13 +tags: [bert, tr, open_source, fill_mask, onnx] +task: Embeddings +language: tr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish_base_bert_uncased` is a Turkish model originally trained by ytu-ce-cosmos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_base_bert_uncased_tr_5.1.1_3.0_1694631273482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_base_bert_uncased_tr_5.1.1_3.0_1694631273482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("turkish_base_bert_uncased","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("turkish_base_bert_uncased", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_base_bert_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|413.0 MB| + +## References + +https://huggingface.co/ytu-ce-cosmos/turkish-base-bert-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-turkish_medium_bert_uncased_tr.md b/docs/_posts/ahmedlone127/2023-09-13-turkish_medium_bert_uncased_tr.md new file mode 100644 index 00000000000000..f2fc72a6a7920a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-turkish_medium_bert_uncased_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish turkish_medium_bert_uncased BertEmbeddings from ytu-ce-cosmos +author: John Snow Labs +name: turkish_medium_bert_uncased +date: 2023-09-13 +tags: [bert, tr, open_source, fill_mask, onnx] +task: Embeddings +language: tr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish_medium_bert_uncased` is a Turkish model originally trained by ytu-ce-cosmos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_medium_bert_uncased_tr_5.1.1_3.0_1694629908740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_medium_bert_uncased_tr_5.1.1_3.0_1694629908740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("turkish_medium_bert_uncased","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("turkish_medium_bert_uncased", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_medium_bert_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|157.4 MB| + +## References + +https://huggingface.co/ytu-ce-cosmos/turkish-medium-bert-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-turkish_mini_bert_uncased_tr.md b/docs/_posts/ahmedlone127/2023-09-13-turkish_mini_bert_uncased_tr.md new file mode 100644 index 00000000000000..38b0dace2beb3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-turkish_mini_bert_uncased_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish turkish_mini_bert_uncased BertEmbeddings from ytu-ce-cosmos +author: John Snow Labs +name: turkish_mini_bert_uncased +date: 2023-09-13 +tags: [bert, tr, open_source, fill_mask, onnx] +task: Embeddings +language: tr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish_mini_bert_uncased` is a Turkish model originally trained by ytu-ce-cosmos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_mini_bert_uncased_tr_5.1.1_3.0_1694629425088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_mini_bert_uncased_tr_5.1.1_3.0_1694629425088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("turkish_mini_bert_uncased","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("turkish_mini_bert_uncased", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_mini_bert_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|43.3 MB| + +## References + +https://huggingface.co/ytu-ce-cosmos/turkish-mini-bert-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-turkish_small_bert_uncased_tr.md b/docs/_posts/ahmedlone127/2023-09-13-turkish_small_bert_uncased_tr.md new file mode 100644 index 00000000000000..7e662c5a7b01ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-turkish_small_bert_uncased_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish turkish_small_bert_uncased BertEmbeddings from ytu-ce-cosmos +author: John Snow Labs +name: turkish_small_bert_uncased +date: 2023-09-13 +tags: [bert, tr, open_source, fill_mask, onnx] +task: Embeddings +language: tr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish_small_bert_uncased` is a Turkish model originally trained by ytu-ce-cosmos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_small_bert_uncased_tr_5.1.1_3.0_1694629693791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_small_bert_uncased_tr_5.1.1_3.0_1694629693791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("turkish_small_bert_uncased","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("turkish_small_bert_uncased", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_small_bert_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|109.9 MB| + +## References + +https://huggingface.co/ytu-ce-cosmos/turkish-small-bert-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-turkish_tiny_bert_uncased_tr.md b/docs/_posts/ahmedlone127/2023-09-13-turkish_tiny_bert_uncased_tr.md new file mode 100644 index 00000000000000..8b1f6a2abe9735 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-turkish_tiny_bert_uncased_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish turkish_tiny_bert_uncased BertEmbeddings from ytu-ce-cosmos +author: John Snow Labs +name: turkish_tiny_bert_uncased +date: 2023-09-13 +tags: [bert, tr, open_source, fill_mask, onnx] +task: Embeddings +language: tr +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish_tiny_bert_uncased` is a Turkish model originally trained by ytu-ce-cosmos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_tiny_bert_uncased_tr_5.1.1_3.0_1694629276501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_tiny_bert_uncased_tr_5.1.1_3.0_1694629276501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("turkish_tiny_bert_uncased","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("turkish_tiny_bert_uncased", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_tiny_bert_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|17.4 MB| + +## References + +https://huggingface.co/ytu-ce-cosmos/turkish-tiny-bert-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-writter_bert_hep_en.md b/docs/_posts/ahmedlone127/2023-09-13-writter_bert_hep_en.md new file mode 100644 index 00000000000000..a2c65af1607008 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-writter_bert_hep_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English writter_bert_hep BertEmbeddings from munozariasjm +author: John Snow Labs +name: writter_bert_hep +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `writter_bert_hep` is an English model originally trained by munozariasjm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/writter_bert_hep_en_5.1.1_3.0_1694626233603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/writter_bert_hep_en_5.1.1_3.0_1694626233603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("writter_bert_hep","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("writter_bert_hep", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|writter_bert_hep| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/munozariasjm/writter_bert_hep \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-xtremedistil_l12_h384_uncased_finetuned_wikitext103_en.md b/docs/_posts/ahmedlone127/2023-09-13-xtremedistil_l12_h384_uncased_finetuned_wikitext103_en.md new file mode 100644 index 00000000000000..966362a70b6660 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-xtremedistil_l12_h384_uncased_finetuned_wikitext103_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English xtremedistil_l12_h384_uncased_finetuned_wikitext103 BertEmbeddings from saghar +author: John Snow Labs +name: xtremedistil_l12_h384_uncased_finetuned_wikitext103 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil_l12_h384_uncased_finetuned_wikitext103` is an English model originally trained by saghar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xtremedistil_l12_h384_uncased_finetuned_wikitext103_en_5.1.1_3.0_1694616486581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xtremedistil_l12_h384_uncased_finetuned_wikitext103_en_5.1.1_3.0_1694616486581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("xtremedistil_l12_h384_uncased_finetuned_wikitext103","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("xtremedistil_l12_h384_uncased_finetuned_wikitext103", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xtremedistil_l12_h384_uncased_finetuned_wikitext103| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|123.6 MB| + +## References + +https://huggingface.co/saghar/xtremedistil-l12-h384-uncased-finetuned-wikitext103 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-xtremedistil_l6_h384_uncased_finetuned_wikitext103_en.md b/docs/_posts/ahmedlone127/2023-09-13-xtremedistil_l6_h384_uncased_finetuned_wikitext103_en.md new file mode 100644 index 00000000000000..478ff4678dc65a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-xtremedistil_l6_h384_uncased_finetuned_wikitext103_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English xtremedistil_l6_h384_uncased_finetuned_wikitext103 BertEmbeddings from saghar +author: John Snow Labs +name: xtremedistil_l6_h384_uncased_finetuned_wikitext103 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil_l6_h384_uncased_finetuned_wikitext103` is an English model originally trained by saghar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xtremedistil_l6_h384_uncased_finetuned_wikitext103_en_5.1.1_3.0_1694616303708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xtremedistil_l6_h384_uncased_finetuned_wikitext103_en_5.1.1_3.0_1694616303708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("xtremedistil_l6_h384_uncased_finetuned_wikitext103","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("xtremedistil_l6_h384_uncased_finetuned_wikitext103", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xtremedistil_l6_h384_uncased_finetuned_wikitext103| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|83.6 MB| + +## References + +https://huggingface.co/saghar/xtremedistil-l6-h384-uncased-finetuned-wikitext103 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-yelpy_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-yelpy_bert_en.md new file mode 100644 index 00000000000000..2283ab40fddadc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-yelpy_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English yelpy_bert BertEmbeddings from rttl-ai +author: John Snow Labs +name: yelpy_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yelpy_bert` is an English model originally trained by rttl-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yelpy_bert_en_5.1.1_3.0_1694583064434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yelpy_bert_en_5.1.1_3.0_1694583064434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("yelpy_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("yelpy_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yelpy_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/rttl-ai/yelpy-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-youtube_bert_10_en.md b/docs/_posts/ahmedlone127/2023-09-13-youtube_bert_10_en.md new file mode 100644 index 00000000000000..50a91796587ee1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-youtube_bert_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English youtube_bert_10 BertEmbeddings from flboehm +author: John Snow Labs +name: youtube_bert_10 +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `youtube_bert_10` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/youtube_bert_10_en_5.1.1_3.0_1694646466454.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/youtube_bert_10_en_5.1.1_3.0_1694646466454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("youtube_bert_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("youtube_bert_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|youtube_bert_10| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.6 MB| + +## References + +https://huggingface.co/flboehm/youtube-bert_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-13-youtube_bert_en.md b/docs/_posts/ahmedlone127/2023-09-13-youtube_bert_en.md new file mode 100644 index 00000000000000..ffd52bceb7b195 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-13-youtube_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English youtube_bert BertEmbeddings from flboehm +author: John Snow Labs +name: youtube_bert +date: 2023-09-13 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `youtube_bert` is an English model originally trained by flboehm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/youtube_bert_en_5.1.1_3.0_1694646017109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/youtube_bert_en_5.1.1_3.0_1694646017109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("youtube_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("youtube_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|youtube_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.6 MB| + +## References + +https://huggingface.co/flboehm/youtube-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-9.4aistudy_en.md b/docs/_posts/ahmedlone127/2023-09-14-9.4aistudy_en.md new file mode 100644 index 00000000000000..a4c07e832d1b47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-9.4aistudy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English 9.4aistudy BertEmbeddings from hangmu +author: John Snow Labs +name: 9.4aistudy +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `9.4aistudy` is an English model originally trained by hangmu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/9.4aistudy_en_5.1.1_3.0_1694652490572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/9.4aistudy_en_5.1.1_3.0_1694652490572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("9.4aistudy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("9.4aistudy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|9.4aistudy| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/hangmu/9.4AIstudy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-akkbert_en.md b/docs/_posts/ahmedlone127/2023-09-14-akkbert_en.md new file mode 100644 index 00000000000000..11ae896949b576 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-akkbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English akkbert BertEmbeddings from megamattc +author: John Snow Labs +name: akkbert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `akkbert` is an English model originally trained by megamattc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/akkbert_en_5.1.1_3.0_1694663523335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/akkbert_en_5.1.1_3.0_1694663523335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("akkbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("akkbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|akkbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|664.9 MB| + +## References + +https://huggingface.co/megamattc/AkkBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-alephbertgimmel_base_512_he.md b/docs/_posts/ahmedlone127/2023-09-14-alephbertgimmel_base_512_he.md new file mode 100644 index 00000000000000..591248169da26e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-alephbertgimmel_base_512_he.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hebrew alephbertgimmel_base_512 BertEmbeddings from imvladikon +author: John Snow Labs +name: alephbertgimmel_base_512 +date: 2023-09-14 +tags: [bert, he, open_source, fill_mask, onnx] +task: Embeddings +language: he +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `alephbertgimmel_base_512` is a Hebrew model originally trained by imvladikon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/alephbertgimmel_base_512_he_5.1.1_3.0_1694658101075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/alephbertgimmel_base_512_he_5.1.1_3.0_1694658101075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("alephbertgimmel_base_512","he") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("alephbertgimmel_base_512", "he") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|alephbertgimmel_base_512| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|he| +|Size:|690.4 MB| + +## References + +https://huggingface.co/imvladikon/alephbertgimmel-base-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-aligner_english_vietnamese_en.md b/docs/_posts/ahmedlone127/2023-09-14-aligner_english_vietnamese_en.md new file mode 100644 index 00000000000000..1226fbe15e8e20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-aligner_english_vietnamese_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English aligner_english_vietnamese BertEmbeddings from hdmt +author: John Snow Labs +name: aligner_english_vietnamese +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aligner_english_vietnamese` is an English model originally trained by hdmt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aligner_english_vietnamese_en_5.1.1_3.0_1694655583036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aligner_english_vietnamese_en_5.1.1_3.0_1694655583036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("aligner_english_vietnamese","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("aligner_english_vietnamese", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aligner_english_vietnamese| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/hdmt/aligner-en-vi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-antismetisim1_finetuned_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-14-antismetisim1_finetuned_mlm_en.md new file mode 100644 index 00000000000000..8a81e08a25e7ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-antismetisim1_finetuned_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English antismetisim1_finetuned_mlm BertEmbeddings from Dhanush66 +author: John Snow Labs +name: antismetisim1_finetuned_mlm +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `antismetisim1_finetuned_mlm` is an English model originally trained by Dhanush66. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/antismetisim1_finetuned_mlm_en_5.1.1_3.0_1694667541883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/antismetisim1_finetuned_mlm_en_5.1.1_3.0_1694667541883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("antismetisim1_finetuned_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("antismetisim1_finetuned_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|antismetisim1_finetuned_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Dhanush66/Antismetisim1-finetuned-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-antismetisimlargedata_finetuned_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-14-antismetisimlargedata_finetuned_mlm_en.md new file mode 100644 index 00000000000000..3283bf335f3c73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-antismetisimlargedata_finetuned_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English antismetisimlargedata_finetuned_mlm BertEmbeddings from Dhanush66 +author: John Snow Labs +name: antismetisimlargedata_finetuned_mlm +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `antismetisimlargedata_finetuned_mlm` is an English model originally trained by Dhanush66. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/antismetisimlargedata_finetuned_mlm_en_5.1.1_3.0_1694669960122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/antismetisimlargedata_finetuned_mlm_en_5.1.1_3.0_1694669960122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # BertEmbeddings also needs a "token" column + +embeddings = BertEmbeddings.pretrained("antismetisimlargedata_finetuned_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // BertEmbeddings also needs a "token" column + +val embeddings = BertEmbeddings + .pretrained("antismetisimlargedata_finetuned_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) // data: Spark DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|antismetisimlargedata_finetuned_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|475.5 MB| + +## References + +https://huggingface.co/Dhanush66/AntismetisimLargedata-finetuned-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-applicationbert_en.md b/docs/_posts/ahmedlone127/2023-09-14-applicationbert_en.md new file mode 100644 index 00000000000000..400887f6307b1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-applicationbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English applicationbert BertEmbeddings from EgilKarlsen +author: John Snow Labs +name: applicationbert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `applicationbert` is an English model originally trained by EgilKarlsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/applicationbert_en_5.1.1_3.0_1694664670828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/applicationbert_en_5.1.1_3.0_1694664670828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("applicationbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("applicationbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|applicationbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/EgilKarlsen/ApplicationBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-archaeobert_en.md b/docs/_posts/ahmedlone127/2023-09-14-archaeobert_en.md new file mode 100644 index 00000000000000..cd24212f4ff18d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-archaeobert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English archaeobert BertEmbeddings from alexbrandsen +author: John Snow Labs +name: archaeobert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `archaeobert` is an English model originally trained by alexbrandsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/archaeobert_en_5.1.1_3.0_1694650857417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/archaeobert_en_5.1.1_3.0_1694650857417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("archaeobert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("archaeobert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|archaeobert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.5 MB| + +## References + +https://huggingface.co/alexbrandsen/ArchaeoBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-autotrain_acc_keys_2347073860_en.md b/docs/_posts/ahmedlone127/2023-09-14-autotrain_acc_keys_2347073860_en.md new file mode 100644 index 00000000000000..6230f96852d273 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-autotrain_acc_keys_2347073860_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_acc_keys_2347073860 BertEmbeddings from alanila +author: John Snow Labs +name: autotrain_acc_keys_2347073860 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_acc_keys_2347073860` is an English model originally trained by alanila. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_acc_keys_2347073860_en_5.1.1_3.0_1694660872048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_acc_keys_2347073860_en_5.1.1_3.0_1694660872048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("autotrain_acc_keys_2347073860","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("autotrain_acc_keys_2347073860", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_acc_keys_2347073860| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/alanila/autotrain-acc_keys-2347073860 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-awesome_align_with_corsican_xx.md b/docs/_posts/ahmedlone127/2023-09-14-awesome_align_with_corsican_xx.md new file mode 100644 index 00000000000000..2a88580dea8082 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-awesome_align_with_corsican_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual awesome_align_with_corsican BertEmbeddings from aneuraz +author: John Snow Labs +name: awesome_align_with_corsican +date: 2023-09-14 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `awesome_align_with_corsican` is a Multilingual model originally trained by aneuraz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/awesome_align_with_corsican_xx_5.1.1_3.0_1694653050397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/awesome_align_with_corsican_xx_5.1.1_3.0_1694653050397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("awesome_align_with_corsican","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("awesome_align_with_corsican", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|awesome_align_with_corsican| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| + +## References + +https://huggingface.co/aneuraz/awesome-align-with-co \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-base_mlm_tweet_en.md b/docs/_posts/ahmedlone127/2023-09-14-base_mlm_tweet_en.md new file mode 100644 index 00000000000000..9196ab59237cd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-base_mlm_tweet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English base_mlm_tweet BertEmbeddings from muhtasham +author: John Snow Labs +name: base_mlm_tweet +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_mlm_tweet` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_mlm_tweet_en_5.1.1_3.0_1694664760038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_mlm_tweet_en_5.1.1_3.0_1694664760038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("base_mlm_tweet","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("base_mlm_tweet", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_mlm_tweet| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.6 MB| + +## References + +https://huggingface.co/muhtasham/base-mlm-tweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_application_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_application_en.md new file mode 100644 index 00000000000000..47bc2e814752ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_application_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_application BertEmbeddings from EgilKarlsen +author: John Snow Labs +name: bert_application +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_application` is an English model originally trained by EgilKarlsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_application_en_5.1.1_3.0_1694656899100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_application_en_5.1.1_3.0_1694656899100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_application","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_application", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_application| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/EgilKarlsen/BERT-Application \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_bangla_finetuned_summarization_dataset_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_bangla_finetuned_summarization_dataset_en.md new file mode 100644 index 00000000000000..0af6e45a148065 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_bangla_finetuned_summarization_dataset_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_bangla_finetuned_summarization_dataset BertEmbeddings from arbitropy +author: John Snow Labs +name: bert_base_bangla_finetuned_summarization_dataset +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_bangla_finetuned_summarization_dataset` is an English model originally trained by arbitropy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_bangla_finetuned_summarization_dataset_en_5.1.1_3.0_1694666948637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_bangla_finetuned_summarization_dataset_en_5.1.1_3.0_1694666948637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_bangla_finetuned_summarization_dataset","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_bangla_finetuned_summarization_dataset", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_bangla_finetuned_summarization_dataset| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/arbitropy/bert-base-bangla-finetuned-summarization-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_bookcorpus_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_bookcorpus_en.md new file mode 100644 index 00000000000000..3f0d44679ba69c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_bookcorpus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_bookcorpus BertEmbeddings from nicholasKluge +author: John Snow Labs +name: bert_base_bookcorpus +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_bookcorpus` is an English model originally trained by nicholasKluge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_bookcorpus_en_5.1.1_3.0_1694660865879.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_bookcorpus_en_5.1.1_3.0_1694660865879.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_bookcorpus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_bookcorpus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_bookcorpus| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.3 MB| + +## References + +https://huggingface.co/nicholasKluge/bert-base-bookcorpus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_conversational_finetuned_wallisian_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_conversational_finetuned_wallisian_en.md new file mode 100644 index 00000000000000..c7bc69f37d0bc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_conversational_finetuned_wallisian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_conversational_finetuned_wallisian BertEmbeddings from btamm12 +author: John Snow Labs +name: bert_base_cased_conversational_finetuned_wallisian +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_conversational_finetuned_wallisian` is an English model originally trained by btamm12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_conversational_finetuned_wallisian_en_5.1.1_3.0_1694652098523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_conversational_finetuned_wallisian_en_5.1.1_3.0_1694652098523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_cased_conversational_finetuned_wallisian","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_conversational_finetuned_wallisian", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_conversational_finetuned_wallisian| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.8 MB| + +## References + +https://huggingface.co/btamm12/bert-base-cased-conversational-finetuned-wls \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_chemistry_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_chemistry_en.md new file mode 100644 index 00000000000000..06b0d609693eb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_chemistry_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_chemistry BertEmbeddings from Kuaaangwen +author: John Snow Labs +name: bert_base_cased_finetuned_chemistry +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_chemistry` is an English model originally trained by Kuaaangwen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_chemistry_en_5.1.1_3.0_1694662575342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_chemistry_en_5.1.1_3.0_1694662575342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_chemistry","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_finetuned_chemistry", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_chemistry| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Kuaaangwen/bert-base-cased-finetuned-chemistry \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_1ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_1ep_en.md new file mode 100644 index 00000000000000..d9329758bee6d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_1ep_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_wallisian_manual_1ep BertEmbeddings from btamm12 +author: John Snow Labs +name: bert_base_cased_finetuned_wallisian_manual_1ep +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_1ep` is an English model originally trained by btamm12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_1ep_en_5.1.1_3.0_1694671264388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_1ep_en_5.1.1_3.0_1694671264388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_1ep","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_finetuned_wallisian_manual_1ep", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_wallisian_manual_1ep| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-1ep \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_2ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_2ep_en.md new file mode 100644 index 00000000000000..76cab26440f341 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_2ep_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_wallisian_manual_2ep BertEmbeddings from btamm12 +author: John Snow Labs +name: bert_base_cased_finetuned_wallisian_manual_2ep +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_2ep` is an English model originally trained by btamm12.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_2ep_en_5.1.1_3.0_1694671483581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_2ep_en_5.1.1_3.0_1694671483581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_2ep","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_cased_finetuned_wallisian_manual_2ep", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_wallisian_manual_2ep| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-2ep \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_3ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_3ep_en.md new file mode 100644 index 00000000000000..c2df8cab1a6d4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_3ep_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_wallisian_manual_3ep BertEmbeddings from btamm12 +author: John Snow Labs +name: bert_base_cased_finetuned_wallisian_manual_3ep +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_3ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_3ep_en_5.1.1_3.0_1694671699838.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_3ep_en_5.1.1_3.0_1694671699838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_3ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_manual_3ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_manual_3ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-3ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_4ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_4ep_en.md
new file mode 100644
index 00000000000000..31140ef3ac2ca7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_4ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_manual_4ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_manual_4ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_4ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_4ep_en_5.1.1_3.0_1694671916586.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_4ep_en_5.1.1_3.0_1694671916586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_4ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_manual_4ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_manual_4ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-4ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_5ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_5ep_en.md
new file mode 100644
index 00000000000000..8b557a038a1c52
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_5ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_manual_5ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_manual_5ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_5ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_5ep_en_5.1.1_3.0_1694672136887.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_5ep_en_5.1.1_3.0_1694672136887.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_5ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_manual_5ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_manual_5ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-5ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_6ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_6ep_en.md
new file mode 100644
index 00000000000000..5c362febe31be4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_6ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_manual_6ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_manual_6ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_6ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_6ep_en_5.1.1_3.0_1694672358346.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_6ep_en_5.1.1_3.0_1694672358346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_6ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_manual_6ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_manual_6ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-6ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_7ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_7ep_en.md
new file mode 100644
index 00000000000000..a1c6991c44f6cd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_7ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_manual_7ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_manual_7ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_7ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_7ep_en_5.1.1_3.0_1694672574132.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_7ep_en_5.1.1_3.0_1694672574132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_7ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_manual_7ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_manual_7ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-7ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_8ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_8ep_en.md
new file mode 100644
index 00000000000000..6023d76736cc6a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_manual_8ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_manual_8ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_manual_8ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_manual_8ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_8ep_en_5.1.1_3.0_1694672793233.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_manual_8ep_en_5.1.1_3.0_1694672793233.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_manual_8ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_manual_8ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_manual_8ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-manual-8ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_10ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_10ep_en.md
new file mode 100644
index 00000000000000..8907e33aa75460
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_10ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_10ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_10ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_10ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_10ep_en_5.1.1_3.0_1694671157614.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_10ep_en_5.1.1_3.0_1694671157614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_10ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_10ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_10ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-10ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_1ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_1ep_en.md
new file mode 100644
index 00000000000000..586ef7cf3abcb5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_1ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_1ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_1ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_1ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_1ep_en_5.1.1_3.0_1694670183935.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_1ep_en_5.1.1_3.0_1694670183935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_1ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_1ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_1ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-1ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_2ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_2ep_en.md
new file mode 100644
index 00000000000000..9b838a1b0d25b5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_2ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_2ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_2ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_2ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_2ep_en_5.1.1_3.0_1694670290798.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_2ep_en_5.1.1_3.0_1694670290798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_2ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_2ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_2ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-2ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_3ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_3ep_en.md
new file mode 100644
index 00000000000000..ce81512a20d135
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_3ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_3ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_3ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_3ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_3ep_en_5.1.1_3.0_1694670399822.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_3ep_en_5.1.1_3.0_1694670399822.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_3ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_3ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_3ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-3ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_4ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_4ep_en.md
new file mode 100644
index 00000000000000..ca7e21df625eee
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_4ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_4ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_4ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_4ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_4ep_en_5.1.1_3.0_1694670508413.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_4ep_en_5.1.1_3.0_1694670508413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_4ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_4ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_4ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-4ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_5ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_5ep_en.md
new file mode 100644
index 00000000000000..0a0240cda3e66a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_5ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_5ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_5ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_5ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_5ep_en_5.1.1_3.0_1694670617445.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_5ep_en_5.1.1_3.0_1694670617445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
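+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_5ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_5ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+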
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_5ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-5ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_6ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_6ep_en.md
new file mode 100644
index 00000000000000..84352c6d887e03
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_6ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_6ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_6ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_6ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_6ep_en_5.1.1_3.0_1694670727672.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_6ep_en_5.1.1_3.0_1694670727672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
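+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_6ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_6ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+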
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_6ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-6ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_7ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_7ep_en.md
new file mode 100644
index 00000000000000..6e840d323af79f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_7ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_7ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_7ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_7ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_7ep_en_5.1.1_3.0_1694670836499.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_7ep_en_5.1.1_3.0_1694670836499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
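+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_7ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_7ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+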
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_7ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-7ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_8ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_8ep_en.md
new file mode 100644
index 00000000000000..e8996105cf1042
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_8ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_8ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_8ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_8ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_8ep_en_5.1.1_3.0_1694670943658.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_8ep_en_5.1.1_3.0_1694670943658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
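+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_8ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_8ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+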
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_8ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-8ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_9ep_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_9ep_en.md
new file mode 100644
index 00000000000000..03359371fcbeb5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_cased_finetuned_wallisian_whisper_9ep_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_wallisian_whisper_9ep BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_cased_finetuned_wallisian_whisper_9ep
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_wallisian_whisper_9ep` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_9ep_en_5.1.1_3.0_1694671051592.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_wallisian_whisper_9ep_en_5.1.1_3.0_1694671051592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
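+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_cased_finetuned_wallisian_whisper_9ep","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_cased_finetuned_wallisian_whisper_9ep", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+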
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_wallisian_whisper_9ep|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-cased-finetuned-wls-whisper-9ep
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_code_comments_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_code_comments_en.md
new file mode 100644
index 00000000000000..2eaf964fb5fffd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_code_comments_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_code_comments BertEmbeddings from giganticode
+author: John Snow Labs
+name: bert_base_code_comments
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_code_comments` is an English model originally trained by giganticode.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_code_comments_en_5.1.1_3.0_1694649864357.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_code_comments_en_5.1.1_3.0_1694649864357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
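+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_code_comments","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_code_comments", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+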
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_code_comments|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/giganticode/bert-base-code_comments
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_german_cased_mlm_basque_chemistry_regulation_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_german_cased_mlm_basque_chemistry_regulation_en.md
new file mode 100644
index 00000000000000..e95d32ec292b16
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_german_cased_mlm_basque_chemistry_regulation_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_german_cased_mlm_basque_chemistry_regulation BertEmbeddings from jonas-luehrs
+author: John Snow Labs
+name: bert_base_german_cased_mlm_basque_chemistry_regulation
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_cased_mlm_basque_chemistry_regulation` is an English model originally trained by jonas-luehrs.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_mlm_basque_chemistry_regulation_en_5.1.1_3.0_1694669620686.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_mlm_basque_chemistry_regulation_en_5.1.1_3.0_1694669620686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
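+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_german_cased_mlm_basque_chemistry_regulation","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_german_cased_mlm_basque_chemistry_regulation", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+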
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_german_cased_mlm_basque_chemistry_regulation|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|406.9 MB|
+
+## References
+
+https://huggingface.co/jonas-luehrs/bert-base-german-cased-MLM-eu_chemistry_regulation
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_german_europeana_td_cased_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_german_europeana_td_cased_en.md
new file mode 100644
index 00000000000000..c9a951f7840630
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_german_europeana_td_cased_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_german_europeana_td_cased BertEmbeddings from dbmdz
+author: John Snow Labs
+name: bert_base_german_europeana_td_cased
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_europeana_td_cased` is an English model originally trained by dbmdz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_europeana_td_cased_en_5.1.1_3.0_1694652557402.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_europeana_td_cased_en_5.1.1_3.0_1694652557402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
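+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_german_europeana_td_cased","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_german_europeana_td_cased", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+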
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_german_europeana_td_cased|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|412.4 MB|
+
+## References
+
+https://huggingface.co/dbmdz/bert-base-german-europeana-td-cased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_greek_uncased_v5_finetuned_polylex_malagasy_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_greek_uncased_v5_finetuned_polylex_malagasy_en.md
new file mode 100644
index 00000000000000..fbc8517052ed2e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_greek_uncased_v5_finetuned_polylex_malagasy_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_greek_uncased_v5_finetuned_polylex_malagasy BertEmbeddings from snousias
+author: John Snow Labs
+name: bert_base_greek_uncased_v5_finetuned_polylex_malagasy
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_greek_uncased_v5_finetuned_polylex_malagasy` is an English model originally trained by snousias.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_greek_uncased_v5_finetuned_polylex_malagasy_en_5.1.1_3.0_1694649633484.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_greek_uncased_v5_finetuned_polylex_malagasy_en_5.1.1_3.0_1694649633484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
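+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_greek_uncased_v5_finetuned_polylex_malagasy","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_greek_uncased_v5_finetuned_polylex_malagasy", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+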
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_greek_uncased_v5_finetuned_polylex_malagasy|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|421.1 MB|
+
+## References
+
+https://huggingface.co/snousias/bert-base-greek-uncased-v5-finetuned-polylex-mg
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_greek_uncased_v6_finetuned_polylex_malagasy_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_greek_uncased_v6_finetuned_polylex_malagasy_en.md
new file mode 100644
index 00000000000000..2d05099788c1f4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_greek_uncased_v6_finetuned_polylex_malagasy_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_greek_uncased_v6_finetuned_polylex_malagasy BertEmbeddings from polylexmg
+author: John Snow Labs
+name: bert_base_greek_uncased_v6_finetuned_polylex_malagasy
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_greek_uncased_v6_finetuned_polylex_malagasy` is an English model originally trained by polylexmg.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_greek_uncased_v6_finetuned_polylex_malagasy_en_5.1.1_3.0_1694649912799.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_greek_uncased_v6_finetuned_polylex_malagasy_en_5.1.1_3.0_1694649912799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
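+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_greek_uncased_v6_finetuned_polylex_malagasy","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_greek_uncased_v6_finetuned_polylex_malagasy", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+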
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_greek_uncased_v6_finetuned_polylex_malagasy|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|421.1 MB|
+
+## References
+
+https://huggingface.co/polylexmg/bert-base-greek-uncased-v6-finetuned-polylex-mg
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_kor_v1_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_kor_v1_en.md
new file mode 100644
index 00000000000000..687a9c8126d399
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_kor_v1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_kor_v1 BertEmbeddings from bongsoo
+author: John Snow Labs
+name: bert_base_kor_v1
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_kor_v1` is an English model originally trained by bongsoo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_kor_v1_en_5.1.1_3.0_1694653204759.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_kor_v1_en_5.1.1_3.0_1694653204759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
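+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_kor_v1","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// BertEmbeddings reads a "token" column, so a tokenizer stage must produce it first
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_kor_v1", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+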
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_kor_v1|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|405.6 MB|
+
+## References
+
+https://huggingface.co/bongsoo/bert-base-kor-v1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_minipile_128_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_minipile_128_en.md
new file mode 100644
index 00000000000000..f2cd21d687e58e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_minipile_128_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_minipile_128 BertEmbeddings from seba
+author: John Snow Labs
+name: bert_base_minipile_128
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_minipile_128` is an English model originally trained by seba.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_minipile_128_en_5.1.1_3.0_1694666014699.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_minipile_128_en_5.1.1_3.0_1694666014699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_minipile_128", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_minipile_128", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_minipile_128|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|402.6 MB|
+
+## References
+
+https://huggingface.co/seba/bert-base-minipile-128
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_spanish_wwm_cased_finetuned_peppa_pig_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_spanish_wwm_cased_finetuned_peppa_pig_en.md
new file mode 100644
index 00000000000000..24a4d3ea8867ea
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_spanish_wwm_cased_finetuned_peppa_pig_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_spanish_wwm_cased_finetuned_peppa_pig BertEmbeddings from guidoivetta
+author: John Snow Labs
+name: bert_base_spanish_wwm_cased_finetuned_peppa_pig
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_peppa_pig` is an English model originally trained by guidoivetta.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_peppa_pig_en_5.1.1_3.0_1694669728508.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_peppa_pig_en_5.1.1_3.0_1694669728508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_spanish_wwm_cased_finetuned_peppa_pig", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_spanish_wwm_cased_finetuned_peppa_pig", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_spanish_wwm_cased_finetuned_peppa_pig|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|409.5 MB|
+
+## References
+
+https://huggingface.co/guidoivetta/bert-base-spanish-wwm-cased-finetuned-peppa-pig
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish_en.md
new file mode 100644
index 00000000000000..3c0e42b22f109e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish BertEmbeddings from guidoivetta
+author: John Snow Labs
+name: bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish` is an English model originally trained by guidoivetta.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish_en_5.1.1_3.0_1694669839262.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish_en_5.1.1_3.0_1694669839262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_spanish_wwm_cased_finetuned_wine_reviews_spanish|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|409.5 MB|
+
+## References
+
+https://huggingface.co/guidoivetta/bert-base-spanish-wwm-cased-finetuned-wine-reviews_spanish
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_duplicate_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_duplicate_en.md
new file mode 100644
index 00000000000000..80597bd7dbab61
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_duplicate_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_duplicate BertEmbeddings from julien-c
+author: John Snow Labs
+name: bert_base_uncased_duplicate
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_duplicate` is an English model originally trained by julien-c.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_duplicate_en_5.1.1_3.0_1694665842003.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_duplicate_en_5.1.1_3.0_1694665842003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_duplicate", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_duplicate", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_duplicate|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/julien-c/bert-base-uncased-duplicate
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_char_hangman_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_char_hangman_en.md
new file mode 100644
index 00000000000000..16454bec700c98
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_char_hangman_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_char_hangman BertEmbeddings from bhagasra-saurav
+author: John Snow Labs
+name: bert_base_uncased_finetuned_char_hangman
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_char_hangman` is an English model originally trained by bhagasra-saurav.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_char_hangman_en_5.1.1_3.0_1694659746036.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_char_hangman_en_5.1.1_3.0_1694659746036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_char_hangman", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_char_hangman", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_char_hangman|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/bhagasra-saurav/bert-base-uncased-finetuned-char-hangman
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_en.md
new file mode 100644
index 00000000000000..4c56d62315f39d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_wallisian BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_uncased_finetuned_wallisian
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_en_5.1.1_3.0_1694653293253.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_en_5.1.1_3.0_1694653293253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_wallisian", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_wallisian|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_lower_en.md
new file mode 100644
index 00000000000000..98fcde4093640e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_lower_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_wallisian_lower BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_uncased_finetuned_wallisian_lower
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_lower` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_lower_en_5.1.1_3.0_1694653735399.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_lower_en_5.1.1_3.0_1694653735399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_lower", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_wallisian_lower", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_wallisian_lower|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-lower
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_1ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_1ep_lower_en.md
new file mode 100644
index 00000000000000..039e69a18f7a89
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_1ep_lower_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_wallisian_manual_1ep_lower BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_uncased_finetuned_wallisian_manual_1ep_lower
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_1ep_lower` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_1ep_lower_en_5.1.1_3.0_1694671374906.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_1ep_lower_en_5.1.1_3.0_1694671374906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_1ep_lower", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_wallisian_manual_1ep_lower", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_wallisian_manual_1ep_lower|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-1ep-lower
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_2ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_2ep_lower_en.md
new file mode 100644
index 00000000000000..3311fb19c57a28
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_2ep_lower_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_wallisian_manual_2ep_lower BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_uncased_finetuned_wallisian_manual_2ep_lower
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_2ep_lower` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_2ep_lower_en_5.1.1_3.0_1694671592012.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_2ep_lower_en_5.1.1_3.0_1694671592012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_2ep_lower", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_wallisian_manual_2ep_lower", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_wallisian_manual_2ep_lower|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-2ep-lower
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_3ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_3ep_lower_en.md
new file mode 100644
index 00000000000000..2d5fd0bf93e234
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_3ep_lower_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_wallisian_manual_3ep_lower BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_uncased_finetuned_wallisian_manual_3ep_lower
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_3ep_lower` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_3ep_lower_en_5.1.1_3.0_1694671806791.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_3ep_lower_en_5.1.1_3.0_1694671806791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_3ep_lower", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_wallisian_manual_3ep_lower", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_wallisian_manual_3ep_lower|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-3ep-lower
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_4ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_4ep_lower_en.md
new file mode 100644
index 00000000000000..7f501fa3d04407
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_4ep_lower_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_wallisian_manual_4ep_lower BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_uncased_finetuned_wallisian_manual_4ep_lower
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_4ep_lower` is an English model originally trained by btamm12.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_4ep_lower_en_5.1.1_3.0_1694672025708.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_4ep_lower_en_5.1.1_3.0_1694672025708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_4ep_lower", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_finetuned_wallisian_manual_4ep_lower", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_wallisian_manual_4ep_lower|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-4ep-lower
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_5ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_5ep_lower_en.md
new file mode 100644
index 00000000000000..be3bcbd50aaa7b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_5ep_lower_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_wallisian_manual_5ep_lower BertEmbeddings from btamm12
+author: John Snow Labs
+name: bert_base_uncased_finetuned_wallisian_manual_5ep_lower
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_5ep_lower` is an English model originally trained by btamm12.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_5ep_lower_en_5.1.1_3.0_1694672247698.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_5ep_lower_en_5.1.1_3.0_1694672247698.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_5ep_lower","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_wallisian_manual_5ep_lower", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_wallisian_manual_5ep_lower| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-5ep-lower \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_6ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_6ep_lower_en.md new file mode 100644 index 00000000000000..7e0b0f06d1899d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_6ep_lower_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_wallisian_manual_6ep_lower BertEmbeddings from btamm12 +author: John Snow Labs +name: bert_base_uncased_finetuned_wallisian_manual_6ep_lower +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_6ep_lower` is an English model originally trained by btamm12. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_6ep_lower_en_5.1.1_3.0_1694672466472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_6ep_lower_en_5.1.1_3.0_1694672466472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_6ep_lower","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_wallisian_manual_6ep_lower", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_wallisian_manual_6ep_lower| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-6ep-lower \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_7ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_7ep_lower_en.md new file mode 100644 index 00000000000000..3c8ebc6681e002 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_7ep_lower_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_wallisian_manual_7ep_lower BertEmbeddings from btamm12 +author: John Snow Labs +name: bert_base_uncased_finetuned_wallisian_manual_7ep_lower +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_7ep_lower` is an English model originally trained by btamm12. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_7ep_lower_en_5.1.1_3.0_1694672684834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_7ep_lower_en_5.1.1_3.0_1694672684834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_7ep_lower","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_wallisian_manual_7ep_lower", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_wallisian_manual_7ep_lower| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-7ep-lower \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_8ep_lower_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_8ep_lower_en.md new file mode 100644 index 00000000000000..aaba40ad3784d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wallisian_manual_8ep_lower_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_wallisian_manual_8ep_lower BertEmbeddings from btamm12 +author: John Snow Labs +name: bert_base_uncased_finetuned_wallisian_manual_8ep_lower +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wallisian_manual_8ep_lower` is an English model originally trained by btamm12. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_8ep_lower_en_5.1.1_3.0_1694672900256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wallisian_manual_8ep_lower_en_5.1.1_3.0_1694672900256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wallisian_manual_8ep_lower","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_wallisian_manual_8ep_lower", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_wallisian_manual_8ep_lower| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/btamm12/bert-base-uncased-finetuned-wls-manual-8ep-lower \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wikitext_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wikitext_en.md new file mode 100644 index 00000000000000..0c88f303be7b25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_finetuned_wikitext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_wikitext BertEmbeddings from peteryushunli +author: John Snow Labs +name: bert_base_uncased_finetuned_wikitext +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wikitext` is an English model originally trained by peteryushunli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wikitext_en_5.1.1_3.0_1694667435437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wikitext_en_5.1.1_3.0_1694667435437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_finetuned_wikitext","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_finetuned_wikitext", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_wikitext| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/peteryushunli/bert-base-uncased-finetuned-wikitext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_abhilashawasthi_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_abhilashawasthi_en.md new file mode 100644 index 00000000000000..1163955e9e3162 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_abhilashawasthi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_abhilashawasthi BertEmbeddings from abhilashawasthi +author: John Snow Labs +name: bert_base_uncased_issues_128_abhilashawasthi +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_abhilashawasthi` is an English model originally trained by abhilashawasthi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_abhilashawasthi_en_5.1.1_3.0_1694658897200.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_abhilashawasthi_en_5.1.1_3.0_1694658897200.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_abhilashawasthi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_abhilashawasthi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_abhilashawasthi| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/abhilashawasthi/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_bh8648_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_bh8648_en.md new file mode 100644 index 00000000000000..b462324ad6a751 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_bh8648_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_bh8648 BertEmbeddings from bh8648 +author: John Snow Labs +name: bert_base_uncased_issues_128_bh8648 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_bh8648` is an English model originally trained by bh8648. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_bh8648_en_5.1.1_3.0_1694652857821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_bh8648_en_5.1.1_3.0_1694652857821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_bh8648","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_bh8648", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_bh8648| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/bh8648/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_mabrouk_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_mabrouk_en.md new file mode 100644 index 00000000000000..01075aee7ecbd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_mabrouk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_mabrouk BertEmbeddings from mabrouk +author: John Snow Labs +name: bert_base_uncased_issues_128_mabrouk +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_mabrouk` is an English model originally trained by mabrouk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_mabrouk_en_5.1.1_3.0_1694651316234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_mabrouk_en_5.1.1_3.0_1694651316234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_mabrouk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_mabrouk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_mabrouk| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/mabrouk/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_reaverlee_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_reaverlee_en.md new file mode 100644 index 00000000000000..f288864d7ffbf1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_reaverlee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_reaverlee BertEmbeddings from reaverlee +author: John Snow Labs +name: bert_base_uncased_issues_128_reaverlee +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_reaverlee` is an English model originally trained by reaverlee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_reaverlee_en_5.1.1_3.0_1694659370450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_reaverlee_en_5.1.1_3.0_1694659370450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_reaverlee","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_reaverlee", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_reaverlee| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/reaverlee/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_veeps_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_veeps_en.md new file mode 100644 index 00000000000000..d44fce6405fd5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_issues_128_veeps_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_veeps BertEmbeddings from veeps +author: John Snow Labs +name: bert_base_uncased_issues_128_veeps +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_veeps` is an English model originally trained by veeps. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_veeps_en_5.1.1_3.0_1694652544895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_veeps_en_5.1.1_3.0_1694652544895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_issues_128_veeps","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_issues_128_veeps", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_veeps| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/veeps/bert-base-uncased-issues-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_mlm_scirepeval_fos_chemistry_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_mlm_scirepeval_fos_chemistry_en.md new file mode 100644 index 00000000000000..f7b0cea77a16f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_mlm_scirepeval_fos_chemistry_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_mlm_scirepeval_fos_chemistry BertEmbeddings from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_mlm_scirepeval_fos_chemistry +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mlm_scirepeval_fos_chemistry` is an English model originally trained by jonas-luehrs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlm_scirepeval_fos_chemistry_en_5.1.1_3.0_1694658556245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlm_scirepeval_fos_chemistry_en_5.1.1_3.0_1694658556245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_mlm_scirepeval_fos_chemistry","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_mlm_scirepeval_fos_chemistry", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mlm_scirepeval_fos_chemistry| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-MLM-scirepeval_fos_chemistry \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_mlp_scirepeval_chemistry_large_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_mlp_scirepeval_chemistry_large_en.md new file mode 100644 index 00000000000000..aa64f8586a81d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_mlp_scirepeval_chemistry_large_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_mlp_scirepeval_chemistry_large BertEmbeddings from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_mlp_scirepeval_chemistry_large +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mlp_scirepeval_chemistry_large` is an English model originally trained by jonas-luehrs. 
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlp_scirepeval_chemistry_large_en_5.1.1_3.0_1694663054967.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlp_scirepeval_chemistry_large_en_5.1.1_3.0_1694663054967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
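+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_mlp_scirepeval_chemistry_large", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_mlp_scirepeval_chemistry_large", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+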
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_mlp_scirepeval_chemistry_large|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/jonas-luehrs/bert-base-uncased-MLP-scirepeval-chemistry-LARGE
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_narsil_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_narsil_en.md
new file mode 100644
index 00000000000000..0112a7f88adebe
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_narsil_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_narsil BertEmbeddings from Narsil
+author: John Snow Labs
+name: bert_base_uncased_narsil
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_narsil` is an English model originally trained by Narsil.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_narsil_en_5.1.1_3.0_1694650332859.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_narsil_en_5.1.1_3.0_1694650332859.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
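+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_narsil", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_narsil", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+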
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_narsil|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/Narsil/bert-base-uncased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_reviews_128_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_reviews_128_en.md
new file mode 100644
index 00000000000000..40dc7540f16c97
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_uncased_reviews_128_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_uncased_reviews_128 BertEmbeddings from abhilashawasthi
+author: John Snow Labs
+name: bert_base_uncased_reviews_128
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_reviews_128` is an English model originally trained by abhilashawasthi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_reviews_128_en_5.1.1_3.0_1694659261358.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_reviews_128_en_5.1.1_3.0_1694659261358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
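+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_uncased_reviews_128", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_uncased_reviews_128", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+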
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_reviews_128|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/abhilashawasthi/bert-base-uncased-reviews-128
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_base_wikitext_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_base_wikitext_en.md
new file mode 100644
index 00000000000000..cb7988784efdd9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_base_wikitext_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_wikitext BertEmbeddings from nicholasKluge
+author: John Snow Labs
+name: bert_base_wikitext
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_wikitext` is an English model originally trained by nicholasKluge.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_wikitext_en_5.1.1_3.0_1694661280094.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_wikitext_en_5.1.1_3.0_1694661280094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
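+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_base_wikitext", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_base_wikitext", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+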
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_wikitext|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|407.0 MB|
+
+## References
+
+https://huggingface.co/nicholasKluge/bert-base-wikitext
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_based_ner_models_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_based_ner_models_en.md
new file mode 100644
index 00000000000000..281aeae64de7c3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_based_ner_models_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_based_ner_models BertEmbeddings from pragnakalp
+author: John Snow Labs
+name: bert_based_ner_models
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_based_ner_models` is an English model originally trained by pragnakalp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_based_ner_models_en_5.1.1_3.0_1694662008683.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_based_ner_models_en_5.1.1_3.0_1694662008683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
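+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_based_ner_models", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_based_ner_models", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+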
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_based_ner_models|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/pragnakalp/bert_based_ner_models
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10_en.md
new file mode 100644
index 00000000000000..cf5335e2fe1041
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10 BertEmbeddings from jojoUla
+author: John Snow Labs
+name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10` is an English model originally trained by jojoUla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10_en_5.1.1_3.0_1694657886347.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10_en_5.1.1_3.0_1694657886347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
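+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+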
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_10|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-10
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11_en.md
new file mode 100644
index 00000000000000..cabfb54e78437c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11 BertEmbeddings from jojoUla
+author: John Snow Labs
+name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11` is an English model originally trained by jojoUla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11_en_5.1.1_3.0_1694658960750.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11_en_5.1.1_3.0_1694658960750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
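+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+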
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_11|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-11
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12_en.md
new file mode 100644
index 00000000000000..89d91289fa6aae
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12 BertEmbeddings from jojoUla
+author: John Snow Labs
+name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12` is an English model originally trained by jojoUla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12_en_5.1.1_3.0_1694660226986.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12_en_5.1.1_3.0_1694660226986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
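+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+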
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_12|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-12
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13_en.md
new file mode 100644
index 00000000000000..1f323513727c72
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13 BertEmbeddings from jojoUla
+author: John Snow Labs
+name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13` is an English model originally trained by jojoUla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13_en_5.1.1_3.0_1694660823117.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13_en_5.1.1_3.0_1694660823117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
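+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+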
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_13|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-13
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20_en.md
new file mode 100644
index 00000000000000..31f98a8bf31143
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20 BertEmbeddings from jojoUla
+author: John Snow Labs
+name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20` is an English model originally trained by jojoUla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20_en_5.1.1_3.0_1694661462547.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20_en_5.1.1_3.0_1694661462547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
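+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+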
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_20|
+|Compatibility:|Spark NLP 5.1.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-20
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5_en.md
new file mode 100644
index 00000000000000..8b0420e5ba0fee
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5 BertEmbeddings from jojoUla
+author: John Snow Labs
+name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5
+date: 2023-09-14
+tags: [bert, en, open_source, fill_mask, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.1.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5` is an English model originally trained by jojoUla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5_en_5.1.1_3.0_1694654880451.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5_en_5.1.1_3.0_1694654880451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
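+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertEmbeddings
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+# BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+// BertEmbeddings reads the "token" column, so the pipeline needs a Tokenizer.
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+val embeddings = BertEmbeddings
+    .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+// `data` is assumed to be a Spark DataFrame with a "text" column.
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+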
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_5| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6_en.md new file mode 100644 index 00000000000000..5a4eaa348304a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6_en_5.1.1_3.0_1694655730374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6_en_5.1.1_3.0_1694655730374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_6| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7_en.md new file mode 100644 index 00000000000000..ca225039f9ad3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7 BertEmbeddings from jojoUla +author: John Snow Labs +name: bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7` is an English model originally trained by jojoUla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7_en_5.1.1_3.0_1694656705176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7_en_5.1.1_3.0_1694656705176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_8_7| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/jojoUla/bert-large-cased-sigir-support-refute-no-label-40-2nd-test-LR10-8-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_nordic_pile_1m_steps_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_nordic_pile_1m_steps_en.md new file mode 100644 index 00000000000000..7244215eb4cd7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_nordic_pile_1m_steps_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_nordic_pile_1m_steps BertEmbeddings from timpal0l +author: John Snow Labs +name: bert_large_nordic_pile_1m_steps +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_nordic_pile_1m_steps` is an English model originally trained by timpal0l. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_nordic_pile_1m_steps_en_5.1.1_3.0_1694666363768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_nordic_pile_1m_steps_en_5.1.1_3.0_1694666363768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_nordic_pile_1m_steps","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_nordic_pile_1m_steps", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_nordic_pile_1m_steps| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.4 GB| + +## References + +https://huggingface.co/timpal0l/bert-large-nordic-pile-1M-steps \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_nordic_pile_1m_steps_sv.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_nordic_pile_1m_steps_sv.md new file mode 100644 index 00000000000000..d3b6c68bf52137 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_nordic_pile_1m_steps_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_large_nordic_pile_1m_steps BertEmbeddings from AI-Sweden-Models +author: John Snow Labs +name: bert_large_nordic_pile_1m_steps +date: 2023-09-14 +tags: [bert, sv, open_source, fill_mask, onnx] +task: Embeddings +language: sv +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_nordic_pile_1m_steps` is a Swedish model originally trained by AI-Sweden-Models. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_nordic_pile_1m_steps_sv_5.1.1_3.0_1694666623839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_nordic_pile_1m_steps_sv_5.1.1_3.0_1694666623839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_nordic_pile_1m_steps","sv") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_nordic_pile_1m_steps", "sv") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_nordic_pile_1m_steps| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|sv| +|Size:|1.4 GB| + +## References + +https://huggingface.co/AI-Sweden-Models/bert-large-nordic-pile-1M-steps \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_large_stackoverflow_comments_1m_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_large_stackoverflow_comments_1m_en.md new file mode 100644 index 00000000000000..97a55a6c69aebc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_large_stackoverflow_comments_1m_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_stackoverflow_comments_1m BertEmbeddings from giganticode +author: John Snow Labs +name: bert_large_stackoverflow_comments_1m +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_stackoverflow_comments_1m` is an English model originally trained by giganticode. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_stackoverflow_comments_1m_en_5.1.1_3.0_1694650551787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_stackoverflow_comments_1m_en_5.1.1_3.0_1694650551787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_large_stackoverflow_comments_1m","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_large_stackoverflow_comments_1m", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_stackoverflow_comments_1m| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/giganticode/bert-large-StackOverflow-comments_1M \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_multilang_finetune_bangla_summarization_dataset_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_multilang_finetune_bangla_summarization_dataset_en.md new file mode 100644 index 00000000000000..bc6fbe813d6aff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_multilang_finetune_bangla_summarization_dataset_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_multilang_finetune_bangla_summarization_dataset BertEmbeddings from arbitropy +author: John Snow Labs +name: bert_multilang_finetune_bangla_summarization_dataset +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilang_finetune_bangla_summarization_dataset` is an English model originally trained by arbitropy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilang_finetune_bangla_summarization_dataset_en_5.1.1_3.0_1694667323120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilang_finetune_bangla_summarization_dataset_en_5.1.1_3.0_1694667323120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_multilang_finetune_bangla_summarization_dataset","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_multilang_finetune_bangla_summarization_dataset", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_multilang_finetune_bangla_summarization_dataset| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/arbitropy/bert-multilang-finetune-bangla-summarization-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_nlp_project_google_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_nlp_project_google_en.md new file mode 100644 index 00000000000000..5ef87ba213725d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_nlp_project_google_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_nlp_project_google BertEmbeddings from jestemleon +author: John Snow Labs +name: bert_nlp_project_google +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_nlp_project_google` is an English model originally trained by jestemleon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_nlp_project_google_en_5.1.1_3.0_1694661280346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_nlp_project_google_en_5.1.1_3.0_1694661280346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_nlp_project_google","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_nlp_project_google", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_nlp_project_google| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/jestemleon/bert-nlp-project-google \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_nlp_project_imdb_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_nlp_project_imdb_en.md new file mode 100644 index 00000000000000..3cd3ae7bdcc554 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_nlp_project_imdb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_nlp_project_imdb BertEmbeddings from jestemleon +author: John Snow Labs +name: bert_nlp_project_imdb +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_nlp_project_imdb` is an English model originally trained by jestemleon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_nlp_project_imdb_en_5.1.1_3.0_1694659623230.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_nlp_project_imdb_en_5.1.1_3.0_1694659623230.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_nlp_project_imdb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_nlp_project_imdb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_nlp_project_imdb| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/jestemleon/bert-nlp-project-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_system_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_system_en.md new file mode 100644 index 00000000000000..9a861bdd9c3c82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_system_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_system BertEmbeddings from EgilKarlsen +author: John Snow Labs +name: bert_system +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_system` is an English model originally trained by EgilKarlsen. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_system_en_5.1.1_3.0_1694654994124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_system_en_5.1.1_3.0_1694654994124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_system","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_system", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_system| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/EgilKarlsen/BERT-System \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bert_ucb_v1_en.md b/docs/_posts/ahmedlone127/2023-09-14-bert_ucb_v1_en.md new file mode 100644 index 00000000000000..d75034ad344dfa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bert_ucb_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ucb_v1 BertEmbeddings from Diegomejia +author: John Snow Labs +name: bert_ucb_v1 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ucb_v1` is an English model originally trained by Diegomejia. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ucb_v1_en_5.1.1_3.0_1694664280264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ucb_v1_en_5.1.1_3.0_1694664280264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_ucb_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_ucb_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ucb_v1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Diegomejia/bert-ucb-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bertimbau_pt.md b/docs/_posts/ahmedlone127/2023-09-14-bertimbau_pt.md new file mode 100644 index 00000000000000..6a2709eb243966 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bertimbau_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bertimbau BertEmbeddings from tubyneto +author: John Snow Labs +name: bertimbau +date: 2023-09-14 +tags: [bert, pt, open_source, fill_mask, onnx] +task: Embeddings +language: pt +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau` is a Portuguese model originally trained by tubyneto. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertimbau_pt_5.1.1_3.0_1694664877599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertimbau_pt_5.1.1_3.0_1694664877599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bertimbau","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bertimbau", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertimbau| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|pt| +|Size:|405.9 MB| + +## References + +https://huggingface.co/tubyneto/bertimbau \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bertugues_base_portuguese_cased_pt.md b/docs/_posts/ahmedlone127/2023-09-14-bertugues_base_portuguese_cased_pt.md new file mode 100644 index 00000000000000..28838eb95b7cf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bertugues_base_portuguese_cased_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bertugues_base_portuguese_cased BertEmbeddings from ricardoz +author: John Snow Labs +name: bertugues_base_portuguese_cased +date: 2023-09-14 +tags: [bert, pt, open_source, fill_mask, onnx] +task: Embeddings +language: pt +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertugues_base_portuguese_cased` is a Portuguese model originally trained by ricardoz. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertugues_base_portuguese_cased_pt_5.1.1_3.0_1694650332853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertugues_base_portuguese_cased_pt_5.1.1_3.0_1694650332853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bertugues_base_portuguese_cased","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bertugues_base_portuguese_cased", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertugues_base_portuguese_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|pt| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ricardoz/BERTugues-base-portuguese-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bibert_v0.1_en.md b/docs/_posts/ahmedlone127/2023-09-14-bibert_v0.1_en.md new file mode 100644 index 00000000000000..c5d5aa8e4af490 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bibert_v0.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bibert_v0.1 BertEmbeddings from yugen-ok +author: John Snow Labs +name: bibert_v0.1 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bibert_v0.1` is an English model originally trained by yugen-ok. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bibert_v0.1_en_5.1.1_3.0_1694666746036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bibert_v0.1_en_5.1.1_3.0_1694666746036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bibert_v0.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bibert_v0.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bibert_v0.1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/yugen-ok/bibert-v0.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-biobit_it.md b/docs/_posts/ahmedlone127/2023-09-14-biobit_it.md new file mode 100644 index 00000000000000..c13cd6256893c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-biobit_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian biobit BertEmbeddings from IVN-RIN +author: John Snow Labs +name: biobit +date: 2023-09-14 +tags: [bert, it, open_source, fill_mask, onnx] +task: Embeddings +language: it +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobit` is an Italian model originally trained by IVN-RIN. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobit_it_5.1.1_3.0_1694659298869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobit_it_5.1.1_3.0_1694659298869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("biobit","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("biobit", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobit| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|it| +|Size:|409.2 MB| + +## References + +https://huggingface.co/IVN-RIN/bioBIT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-biomedvlp_cxr_bert_general_en.md b/docs/_posts/ahmedlone127/2023-09-14-biomedvlp_cxr_bert_general_en.md new file mode 100644 index 00000000000000..a7f4d323893c69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-biomedvlp_cxr_bert_general_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedvlp_cxr_bert_general BertEmbeddings from microsoft +author: John Snow Labs +name: biomedvlp_cxr_bert_general +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedvlp_cxr_bert_general` is an English model originally trained by microsoft. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedvlp_cxr_bert_general_en_5.1.1_3.0_1694659655941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedvlp_cxr_bert_general_en_5.1.1_3.0_1694659655941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("biomedvlp_cxr_bert_general","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("biomedvlp_cxr_bert_general", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedvlp_cxr_bert_general| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|260.2 MB| + +## References + +https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-general \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-bnlp_tokenizer_paraphrase_mlm_bert_900001_en.md b/docs/_posts/ahmedlone127/2023-09-14-bnlp_tokenizer_paraphrase_mlm_bert_900001_en.md new file mode 100644 index 00000000000000..44cf910d36bcc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-bnlp_tokenizer_paraphrase_mlm_bert_900001_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bnlp_tokenizer_paraphrase_mlm_bert_900001 BertEmbeddings from arbitropy +author: John Snow Labs +name: bnlp_tokenizer_paraphrase_mlm_bert_900001 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bnlp_tokenizer_paraphrase_mlm_bert_900001` is an English model originally trained by arbitropy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bnlp_tokenizer_paraphrase_mlm_bert_900001_en_5.1.1_3.0_1694662292369.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bnlp_tokenizer_paraphrase_mlm_bert_900001_en_5.1.1_3.0_1694662292369.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bnlp_tokenizer_paraphrase_mlm_bert_900001","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bnlp_tokenizer_paraphrase_mlm_bert_900001", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bnlp_tokenizer_paraphrase_mlm_bert_900001| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/arbitropy/bnlp-tokenizer-paraphrase-mlm-bert-900001 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-clinicaltrialbiobert_en.md b/docs/_posts/ahmedlone127/2023-09-14-clinicaltrialbiobert_en.md new file mode 100644 index 00000000000000..0ec0ca24b6d969 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-clinicaltrialbiobert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clinicaltrialbiobert BertEmbeddings from domenicrosati +author: John Snow Labs +name: clinicaltrialbiobert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicaltrialbiobert` is an English model originally trained by domenicrosati. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicaltrialbiobert_en_5.1.1_3.0_1694660034709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicaltrialbiobert_en_5.1.1_3.0_1694660034709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("clinicaltrialbiobert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("clinicaltrialbiobert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicaltrialbiobert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.8 MB| + +## References + +https://huggingface.co/domenicrosati/ClinicalTrialBioBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-closure_system_door_inne_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-14-closure_system_door_inne_bert_base_uncased_en.md new file mode 100644 index 00000000000000..dd9a2cd0616120 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-closure_system_door_inne_bert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English closure_system_door_inne_bert_base_uncased BertEmbeddings from Davincilee +author: John Snow Labs +name: closure_system_door_inne_bert_base_uncased +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `closure_system_door_inne_bert_base_uncased` is an English model originally trained by Davincilee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/closure_system_door_inne_bert_base_uncased_en_5.1.1_3.0_1694654562553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/closure_system_door_inne_bert_base_uncased_en_5.1.1_3.0_1694654562553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("closure_system_door_inne_bert_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("closure_system_door_inne_bert_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|closure_system_door_inne_bert_base_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Davincilee/closure_system_door_inne-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-coronasentana_en.md b/docs/_posts/ahmedlone127/2023-09-14-coronasentana_en.md new file mode 100644 index 00000000000000..870dbc60f9b555 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-coronasentana_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English coronasentana BertEmbeddings from Peed911 +author: John Snow Labs +name: coronasentana +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `coronasentana` is an English model originally trained by Peed911. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/coronasentana_en_5.1.1_3.0_1694668810910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/coronasentana_en_5.1.1_3.0_1694668810910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("coronasentana","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("coronasentana", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|coronasentana| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.3 MB| + +## References + +https://huggingface.co/Peed911/CoronaSentAna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-dictabert_he.md b/docs/_posts/ahmedlone127/2023-09-14-dictabert_he.md new file mode 100644 index 00000000000000..832d40238defcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-dictabert_he.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hebrew dictabert BertEmbeddings from dicta-il +author: John Snow Labs +name: dictabert +date: 2023-09-14 +tags: [bert, he, open_source, fill_mask, onnx] +task: Embeddings +language: he +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dictabert` is a Hebrew model originally trained by dicta-il. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dictabert_he_5.1.1_3.0_1694668081428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dictabert_he_5.1.1_3.0_1694668081428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("dictabert","he") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("dictabert", "he") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dictabert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|he| +|Size:|440.2 MB| + +## References + +https://huggingface.co/dicta-il/dictabert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-dictabert_morph_he.md b/docs/_posts/ahmedlone127/2023-09-14-dictabert_morph_he.md new file mode 100644 index 00000000000000..c9fda4c69f2b7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-dictabert_morph_he.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hebrew dictabert_morph BertEmbeddings from dicta-il +author: John Snow Labs +name: dictabert_morph +date: 2023-09-14 +tags: [bert, he, open_source, fill_mask, onnx] +task: Embeddings +language: he +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dictabert_morph` is a Hebrew model originally trained by dicta-il. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dictabert_morph_he_5.1.1_3.0_1694668253883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dictabert_morph_he_5.1.1_3.0_1694668253883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("dictabert_morph","he") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("dictabert_morph", "he") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dictabert_morph| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|he| +|Size:|639.8 MB| + +## References + +https://huggingface.co/dicta-il/dictabert-morph \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-dictabert_seg_he.md b/docs/_posts/ahmedlone127/2023-09-14-dictabert_seg_he.md new file mode 100644 index 00000000000000..1d928a59e56a6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-dictabert_seg_he.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hebrew dictabert_seg BertEmbeddings from dicta-il +author: John Snow Labs +name: dictabert_seg +date: 2023-09-14 +tags: [bert, he, open_source, fill_mask, onnx] +task: Embeddings +language: he +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dictabert_seg` is a Hebrew model originally trained by dicta-il. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dictabert_seg_he_5.1.1_3.0_1694667839991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dictabert_seg_he_5.1.1_3.0_1694667839991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("dictabert_seg","he") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("dictabert_seg", "he") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dictabert_seg| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|he| +|Size:|657.5 MB| + +## References + +https://huggingface.co/dicta-il/dictabert-seg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-domain_adapted_arbert_goudma_bert_en.md b/docs/_posts/ahmedlone127/2023-09-14-domain_adapted_arbert_goudma_bert_en.md new file mode 100644 index 00000000000000..e5a0661b51bbf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-domain_adapted_arbert_goudma_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English domain_adapted_arbert_goudma_bert BertEmbeddings from YassineToughrai +author: John Snow Labs +name: domain_adapted_arbert_goudma_bert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `domain_adapted_arbert_goudma_bert` is an English model originally trained by YassineToughrai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/domain_adapted_arbert_goudma_bert_en_5.1.1_3.0_1694654039836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/domain_adapted_arbert_goudma_bert_en_5.1.1_3.0_1694654039836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("domain_adapted_arbert_goudma_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("domain_adapted_arbert_goudma_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|domain_adapted_arbert_goudma_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|605.0 MB| + +## References + +https://huggingface.co/YassineToughrai/Domain_adapted_ARBERT_GOUDMA_BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-door_inner_with_sa_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-14-door_inner_with_sa_bert_base_uncased_en.md new file mode 100644 index 00000000000000..e2e74acf1e1674 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-door_inner_with_sa_bert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English door_inner_with_sa_bert_base_uncased BertEmbeddings from Davincilee +author: John Snow Labs +name: door_inner_with_sa_bert_base_uncased +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `door_inner_with_sa_bert_base_uncased` is an English model originally trained by Davincilee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/door_inner_with_sa_bert_base_uncased_en_5.1.1_3.0_1694656043395.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/door_inner_with_sa_bert_base_uncased_en_5.1.1_3.0_1694656043395.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("door_inner_with_sa_bert_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("door_inner_with_sa_bert_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|door_inner_with_sa_bert_base_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/Davincilee/door_inner_with_SA-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-dummy_model_linbo_en.md b/docs/_posts/ahmedlone127/2023-09-14-dummy_model_linbo_en.md new file mode 100644 index 00000000000000..7f5458cb5f124f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-dummy_model_linbo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dummy_model_linbo BertEmbeddings from Linbo +author: John Snow Labs +name: dummy_model_linbo +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_model_linbo` is an English model originally trained by Linbo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_model_linbo_en_5.1.1_3.0_1694649822787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_model_linbo_en_5.1.1_3.0_1694649822787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("dummy_model_linbo","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("dummy_model_linbo", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dummy_model_linbo| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Linbo/dummy-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-dzarabert_ar.md b/docs/_posts/ahmedlone127/2023-09-14-dzarabert_ar.md new file mode 100644 index 00000000000000..4b1abced8e95db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-dzarabert_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic dzarabert BertEmbeddings from Sifal +author: John Snow Labs +name: dzarabert +date: 2023-09-14 +tags: [bert, ar, open_source, fill_mask, onnx] +task: Embeddings +language: ar +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dzarabert` is an Arabic model originally trained by Sifal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dzarabert_ar_5.1.1_3.0_1694657627709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dzarabert_ar_5.1.1_3.0_1694657627709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("dzarabert","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("dzarabert", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dzarabert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ar| +|Size:|418.8 MB| + +## References + +https://huggingface.co/Sifal/dzarabert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-frpile_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-14-frpile_mlm_en.md new file mode 100644 index 00000000000000..6f2bcb829de42a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-frpile_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English frpile_mlm BertEmbeddings from DragosGorduza +author: John Snow Labs +name: frpile_mlm +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frpile_mlm` is an English model originally trained by DragosGorduza. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frpile_mlm_en_5.1.1_3.0_1694656401826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frpile_mlm_en_5.1.1_3.0_1694656401826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("frpile_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("frpile_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frpile_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/DragosGorduza/FRPile_MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-gbert_base_finetuned_twitter_janst_en.md b/docs/_posts/ahmedlone127/2023-09-14-gbert_base_finetuned_twitter_janst_en.md new file mode 100644 index 00000000000000..a7d41d1877bdf1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-gbert_base_finetuned_twitter_janst_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gbert_base_finetuned_twitter_janst BertEmbeddings from JanSt +author: John Snow Labs +name: gbert_base_finetuned_twitter_janst +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert_base_finetuned_twitter_janst` is an English model originally trained by JanSt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gbert_base_finetuned_twitter_janst_en_5.1.1_3.0_1694666124788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gbert_base_finetuned_twitter_janst_en_5.1.1_3.0_1694666124788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gbert_base_finetuned_twitter_janst","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gbert_base_finetuned_twitter_janst", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gbert_base_finetuned_twitter_janst| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.7 MB| + +## References + +https://huggingface.co/JanSt/gbert-base-finetuned-twitter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-gbert_large_autopart_en.md b/docs/_posts/ahmedlone127/2023-09-14-gbert_large_autopart_en.md new file mode 100644 index 00000000000000..fe87dd40038a5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-gbert_large_autopart_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gbert_large_autopart BertEmbeddings from luciore95 +author: John Snow Labs +name: gbert_large_autopart +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert_large_autopart` is an English model originally trained by luciore95. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gbert_large_autopart_en_5.1.1_3.0_1694669047387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gbert_large_autopart_en_5.1.1_3.0_1694669047387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gbert_large_autopart","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gbert_large_autopart", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gbert_large_autopart| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/luciore95/gbert-large-autopart \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-gbert_large_finetuned_cust18_en.md b/docs/_posts/ahmedlone127/2023-09-14-gbert_large_finetuned_cust18_en.md new file mode 100644 index 00000000000000..ce2f6db90b1414 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-gbert_large_finetuned_cust18_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gbert_large_finetuned_cust18 BertEmbeddings from shafin +author: John Snow Labs +name: gbert_large_finetuned_cust18 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert_large_finetuned_cust18` is an English model originally trained by shafin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gbert_large_finetuned_cust18_en_5.1.1_3.0_1694664219714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gbert_large_finetuned_cust18_en_5.1.1_3.0_1694664219714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gbert_large_finetuned_cust18","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gbert_large_finetuned_cust18", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gbert_large_finetuned_cust18| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/shafin/gbert-large-finetuned-cust18 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-gbert_large_finetuned_cust_en.md b/docs/_posts/ahmedlone127/2023-09-14-gbert_large_finetuned_cust_en.md new file mode 100644 index 00000000000000..32310949e01b6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-gbert_large_finetuned_cust_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gbert_large_finetuned_cust BertEmbeddings from shafin +author: John Snow Labs +name: gbert_large_finetuned_cust +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert_large_finetuned_cust` is an English model originally trained by shafin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gbert_large_finetuned_cust_en_5.1.1_3.0_1694662333720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gbert_large_finetuned_cust_en_5.1.1_3.0_1694662333720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gbert_large_finetuned_cust","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gbert_large_finetuned_cust", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gbert_large_finetuned_cust| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/shafin/gbert-large-finetuned-cust \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-gepabert_de.md b/docs/_posts/ahmedlone127/2023-09-14-gepabert_de.md new file mode 100644 index 00000000000000..725f463a85f721 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-gepabert_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German gepabert BertEmbeddings from aehrm +author: John Snow Labs +name: gepabert +date: 2023-09-14 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gepabert` is a German model originally trained by aehrm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gepabert_de_5.1.1_3.0_1694654562569.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gepabert_de_5.1.1_3.0_1694654562569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gepabert","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gepabert", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gepabert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|1.3 GB| + +## References + +https://huggingface.co/aehrm/gepabert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-gujarati_bert_scratch_gu.md b/docs/_posts/ahmedlone127/2023-09-14-gujarati_bert_scratch_gu.md new file mode 100644 index 00000000000000..8a8d4b1e24d396 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-gujarati_bert_scratch_gu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Gujarati gujarati_bert_scratch BertEmbeddings from l3cube-pune +author: John Snow Labs +name: gujarati_bert_scratch +date: 2023-09-14 +tags: [bert, gu, open_source, fill_mask, onnx] +task: Embeddings +language: gu +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gujarati_bert_scratch` is a Gujarati model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gujarati_bert_scratch_gu_5.1.1_3.0_1694652276670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gujarati_bert_scratch_gu_5.1.1_3.0_1694652276670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("gujarati_bert_scratch","gu") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("gujarati_bert_scratch", "gu") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gujarati_bert_scratch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|gu| +|Size:|470.4 MB| + +## References + +https://huggingface.co/l3cube-pune/gujarati-bert-scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models_en.md b/docs/_posts/ahmedlone127/2023-09-14-incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models_en.md new file mode 100644 index 00000000000000..bee6ac16a86e3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models BertEmbeddings from sophia-jihye +author: John Snow Labs +name: incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models` is an English model originally trained by sophia-jihye.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models_en_5.1.1_3.0_1694667130967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models_en_5.1.1_3.0_1694667130967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|incorporation_of_company_related_factual_knowledge_into_pre_trained_language_models| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.4 MB| + +## References + +https://huggingface.co/sophia-jihye/Incorporation_of_Company-Related_Factual_Knowledge_into_Pre-trained_Language_Models \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-itd_bert_en.md b/docs/_posts/ahmedlone127/2023-09-14-itd_bert_en.md new file mode 100644 index 00000000000000..f5eccdd79bdf46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-itd_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English itd_bert BertEmbeddings from melll-uff +author: John Snow Labs +name: itd_bert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `itd_bert` is an English model originally trained by melll-uff. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/itd_bert_en_5.1.1_3.0_1694669236393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/itd_bert_en_5.1.1_3.0_1694669236393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("itd_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, Tokenizer().setInputCols(["documents"]).setOutputCol("token"), embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("itd_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, new Tokenizer().setInputCols(Array("documents")).setOutputCol("token"), embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|itd_bert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.3 MB| + +## References + +https://huggingface.co/melll-uff/itd_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-itd_longformer_en.md b/docs/_posts/ahmedlone127/2023-09-14-itd_longformer_en.md new file mode 100644 index 00000000000000..8a2a462ff6c2d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-itd_longformer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English itd_longformer BertEmbeddings from melll-uff +author: John Snow Labs +name: itd_longformer +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `itd_longformer` is an English model originally trained by melll-uff. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/itd_longformer_en_5.1.1_3.0_1694669346528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/itd_longformer_en_5.1.1_3.0_1694669346528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("itd_longformer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("itd_longformer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|itd_longformer| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|415.9 MB| + +## References + +https://huggingface.co/melll-uff/itd_longformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-jobs_pretraining_model_en.md b/docs/_posts/ahmedlone127/2023-09-14-jobs_pretraining_model_en.md new file mode 100644 index 00000000000000..75d3724bbd497b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-jobs_pretraining_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jobs_pretraining_model BertEmbeddings from afif00 +author: John Snow Labs +name: jobs_pretraining_model +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobs_pretraining_model` is an English model originally trained by afif00. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobs_pretraining_model_en_5.1.1_3.0_1694661632301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobs_pretraining_model_en_5.1.1_3.0_1694661632301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("jobs_pretraining_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("jobs_pretraining_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobs_pretraining_model| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/afif00/jobs-pretraining-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-kannada_bert_scratch_kn.md b/docs/_posts/ahmedlone127/2023-09-14-kannada_bert_scratch_kn.md new file mode 100644 index 00000000000000..489c54b6534d66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-kannada_bert_scratch_kn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Kannada kannada_bert_scratch BertEmbeddings from l3cube-pune +author: John Snow Labs +name: kannada_bert_scratch +date: 2023-09-14 +tags: [bert, kn, open_source, fill_mask, onnx] +task: Embeddings +language: kn +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kannada_bert_scratch` is a Kannada model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kannada_bert_scratch_kn_5.1.1_3.0_1694652710651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kannada_bert_scratch_kn_5.1.1_3.0_1694652710651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kannada_bert_scratch","kn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kannada_bert_scratch", "kn") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kannada_bert_scratch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|kn| +|Size:|470.6 MB| + +## References + +https://huggingface.co/l3cube-pune/kannada-bert-scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-klue_bert_mlm_en.md b/docs/_posts/ahmedlone127/2023-09-14-klue_bert_mlm_en.md new file mode 100644 index 00000000000000..a4ec260a075c0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-klue_bert_mlm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English klue_bert_mlm BertEmbeddings from goodjw +author: John Snow Labs +name: klue_bert_mlm +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue_bert_mlm` is an English model originally trained by goodjw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_bert_mlm_en_5.1.1_3.0_1694651022503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_bert_mlm_en_5.1.1_3.0_1694651022503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("klue_bert_mlm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("klue_bert_mlm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_bert_mlm| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/goodjw/klue-bert-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.000006_en.md b/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.000006_en.md new file mode 100644 index 00000000000000..95500a982ee892 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.000006_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kw_pubmed_1000_0.000006 BertEmbeddings from enoriega +author: John Snow Labs +name: kw_pubmed_1000_0.000006 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kw_pubmed_1000_0.000006` is an English model originally trained by enoriega. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kw_pubmed_1000_0.000006_en_5.1.1_3.0_1694663885237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kw_pubmed_1000_0.000006_en_5.1.1_3.0_1694663885237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kw_pubmed_1000_0.000006","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kw_pubmed_1000_0.000006", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kw_pubmed_1000_0.000006| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/enoriega/kw_pubmed_1000_0.000006 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.00006_en.md b/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.00006_en.md new file mode 100644 index 00000000000000..9467e699fed262 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.00006_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kw_pubmed_1000_0.00006 BertEmbeddings from enoriega +author: John Snow Labs +name: kw_pubmed_1000_0.00006 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kw_pubmed_1000_0.00006` is an English model originally trained by enoriega. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kw_pubmed_1000_0.00006_en_5.1.1_3.0_1694663539113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kw_pubmed_1000_0.00006_en_5.1.1_3.0_1694663539113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kw_pubmed_1000_0.00006","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kw_pubmed_1000_0.00006", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kw_pubmed_1000_0.00006| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/enoriega/kw_pubmed_1000_0.00006 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.0003_en.md b/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.0003_en.md new file mode 100644 index 00000000000000..569429ac101f50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-kw_pubmed_1000_0.0003_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kw_pubmed_1000_0.0003 BertEmbeddings from enoriega +author: John Snow Labs +name: kw_pubmed_1000_0.0003 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kw_pubmed_1000_0.0003` is an English model originally trained by enoriega. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kw_pubmed_1000_0.0003_en_5.1.1_3.0_1694663260280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kw_pubmed_1000_0.0003_en_5.1.1_3.0_1694663260280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("kw_pubmed_1000_0.0003","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("kw_pubmed_1000_0.0003", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kw_pubmed_1000_0.0003| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/enoriega/kw_pubmed_1000_0.0003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-legal_hebert_ft_en.md b/docs/_posts/ahmedlone127/2023-09-14-legal_hebert_ft_en.md new file mode 100644 index 00000000000000..87a87782f0a4e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-legal_hebert_ft_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_hebert_ft BertEmbeddings from avichr +author: John Snow Labs +name: legal_hebert_ft +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_hebert_ft` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_hebert_ft_en_5.1.1_3.0_1694656937535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_hebert_ft_en_5.1.1_3.0_1694656937535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("legal_hebert_ft","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("legal_hebert_ft", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_hebert_ft| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.9 MB| + +## References + +https://huggingface.co/avichr/Legal-heBERT_ft \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-legalbert_large_1.7m_1_en.md b/docs/_posts/ahmedlone127/2023-09-14-legalbert_large_1.7m_1_en.md new file mode 100644 index 00000000000000..2129067405d6a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-legalbert_large_1.7m_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legalbert_large_1.7m_1 BertEmbeddings from pile-of-law +author: John Snow Labs +name: legalbert_large_1.7m_1 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legalbert_large_1.7m_1` is an English model originally trained by pile-of-law. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legalbert_large_1.7m_1_en_5.1.1_3.0_1694651512853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legalbert_large_1.7m_1_en_5.1.1_3.0_1694651512853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("legalbert_large_1.7m_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("legalbert_large_1.7m_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legalbert_large_1.7m_1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|643.5 MB| + +## References + +https://huggingface.co/pile-of-law/legalbert-large-1.7M-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-legalbert_large_1.7m_2_en.md b/docs/_posts/ahmedlone127/2023-09-14-legalbert_large_1.7m_2_en.md new file mode 100644 index 00000000000000..512842ee82e8d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-legalbert_large_1.7m_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legalbert_large_1.7m_2 BertEmbeddings from pile-of-law +author: John Snow Labs +name: legalbert_large_1.7m_2 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legalbert_large_1.7m_2` is an English model originally trained by pile-of-law. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legalbert_large_1.7m_2_en_5.1.1_3.0_1694653987323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legalbert_large_1.7m_2_en_5.1.1_3.0_1694653987323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("legalbert_large_1.7m_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("legalbert_large_1.7m_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legalbert_large_1.7m_2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|645.3 MB| + +## References + +https://huggingface.co/pile-of-law/legalbert-large-1.7M-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-logion_50k_wordpiece_en.md b/docs/_posts/ahmedlone127/2023-09-14-logion_50k_wordpiece_en.md new file mode 100644 index 00000000000000..18637a4b33e784 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-logion_50k_wordpiece_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English logion_50k_wordpiece BertEmbeddings from cabrooks +author: John Snow Labs +name: logion_50k_wordpiece +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `logion_50k_wordpiece` is an English model originally trained by cabrooks. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/logion_50k_wordpiece_en_5.1.1_3.0_1694661683131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/logion_50k_wordpiece_en_5.1.1_3.0_1694661683131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("logion_50k_wordpiece","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("logion_50k_wordpiece", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|logion_50k_wordpiece| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.1 MB| + +## References + +https://huggingface.co/cabrooks/LOGION-50k_wordpiece \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-louribert_en.md b/docs/_posts/ahmedlone127/2023-09-14-louribert_en.md new file mode 100644 index 00000000000000..74c2850a318111 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-louribert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English louribert BertEmbeddings from saeid7776 +author: John Snow Labs +name: louribert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `louribert` is an English model originally trained by saeid7776. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/louribert_en_5.1.1_3.0_1694664381526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/louribert_en_5.1.1_3.0_1694664381526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("louribert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("louribert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|louribert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|668.1 MB| + +## References + +https://huggingface.co/saeid7776/LouriBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-louribert_more_tokens_saeid7776_en.md b/docs/_posts/ahmedlone127/2023-09-14-louribert_more_tokens_saeid7776_en.md new file mode 100644 index 00000000000000..a0dbf24ce80480 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-louribert_more_tokens_saeid7776_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English louribert_more_tokens_saeid7776 BertEmbeddings from saeid7776 +author: John Snow Labs +name: louribert_more_tokens_saeid7776 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `louribert_more_tokens_saeid7776` is an English model originally trained by saeid7776. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/louribert_more_tokens_saeid7776_en_5.1.1_3.0_1694665209272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/louribert_more_tokens_saeid7776_en_5.1.1_3.0_1694665209272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("louribert_more_tokens_saeid7776","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("louribert_more_tokens_saeid7776", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|louribert_more_tokens_saeid7776| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|670.8 MB| + +## References + +https://huggingface.co/saeid7776/LouriBert_more_tokens \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-lumbarradiologyreports_en.md b/docs/_posts/ahmedlone127/2023-09-14-lumbarradiologyreports_en.md new file mode 100644 index 00000000000000..bfd7194db7ac69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-lumbarradiologyreports_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English lumbarradiologyreports BertEmbeddings from YK96 +author: John Snow Labs +name: lumbarradiologyreports +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lumbarradiologyreports` is an English model originally trained by YK96. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lumbarradiologyreports_en_5.1.1_3.0_1694669457817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lumbarradiologyreports_en_5.1.1_3.0_1694669457817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("lumbarradiologyreports","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("lumbarradiologyreports", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lumbarradiologyreports| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/YK96/LumbarRadiologyReports \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-malayalam_bert_scratch_ml.md b/docs/_posts/ahmedlone127/2023-09-14-malayalam_bert_scratch_ml.md new file mode 100644 index 00000000000000..3a39ec6415ec0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-malayalam_bert_scratch_ml.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Malayalam malayalam_bert_scratch BertEmbeddings from l3cube-pune +author: John Snow Labs +name: malayalam_bert_scratch +date: 2023-09-14 +tags: [bert, ml, open_source, fill_mask, onnx] +task: Embeddings +language: ml +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malayalam_bert_scratch` is a Malayalam model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malayalam_bert_scratch_ml_5.1.1_3.0_1694651911081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malayalam_bert_scratch_ml_5.1.1_3.0_1694651911081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("malayalam_bert_scratch","ml") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("malayalam_bert_scratch", "ml") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malayalam_bert_scratch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ml| +|Size:|470.7 MB| + +## References + +https://huggingface.co/l3cube-pune/malayalam-bert-scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mbert_squad_en.md b/docs/_posts/ahmedlone127/2023-09-14-mbert_squad_en.md new file mode 100644 index 00000000000000..13c840d9731b59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mbert_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mbert_squad BertEmbeddings from oceanpty +author: John Snow Labs +name: mbert_squad +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_squad` is an English model originally trained by oceanpty. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_squad_en_5.1.1_3.0_1694652176700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_squad_en_5.1.1_3.0_1694652176700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mbert_squad","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mbert_squad", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_squad| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/oceanpty/mbert-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-medbert_512_norwegian_duplicates_de.md b/docs/_posts/ahmedlone127/2023-09-14-medbert_512_norwegian_duplicates_de.md new file mode 100644 index 00000000000000..4af11bf2aa10bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-medbert_512_norwegian_duplicates_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German medbert_512_norwegian_duplicates BertEmbeddings from GerMedBERT +author: John Snow Labs +name: medbert_512_norwegian_duplicates +date: 2023-09-14 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medbert_512_norwegian_duplicates` is a German model originally trained by GerMedBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medbert_512_norwegian_duplicates_de_5.1.1_3.0_1694654562610.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medbert_512_norwegian_duplicates_de_5.1.1_3.0_1694654562610.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("medbert_512_norwegian_duplicates","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("medbert_512_norwegian_duplicates", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medbert_512_norwegian_duplicates| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|406.8 MB| + +## References + +https://huggingface.co/GerMedBERT/medbert-512-no-duplicates \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-medbit_it.md b/docs/_posts/ahmedlone127/2023-09-14-medbit_it.md new file mode 100644 index 00000000000000..7742c28c348186 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-medbit_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian medbit BertEmbeddings from IVN-RIN +author: John Snow Labs +name: medbit +date: 2023-09-14 +tags: [bert, it, open_source, fill_mask, onnx] +task: Embeddings +language: it +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medbit` is an Italian model originally trained by IVN-RIN. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medbit_it_5.1.1_3.0_1694658556151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medbit_it_5.1.1_3.0_1694658556151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("medbit","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("medbit", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medbit| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|it| +|Size:|409.2 MB| + +## References + +https://huggingface.co/IVN-RIN/medBIT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-medbit_r3_plus_it.md b/docs/_posts/ahmedlone127/2023-09-14-medbit_r3_plus_it.md new file mode 100644 index 00000000000000..21c65a8b1c6da2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-medbit_r3_plus_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian medbit_r3_plus BertEmbeddings from IVN-RIN +author: John Snow Labs +name: medbit_r3_plus +date: 2023-09-14 +tags: [bert, it, open_source, fill_mask, onnx] +task: Embeddings +language: it +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medbit_r3_plus` is an Italian model originally trained by IVN-RIN. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medbit_r3_plus_it_5.1.1_3.0_1694655730518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medbit_r3_plus_it_5.1.1_3.0_1694655730518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("medbit_r3_plus","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("medbit_r3_plus", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medbit_r3_plus| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|it| +|Size:|409.2 MB| + +## References + +https://huggingface.co/IVN-RIN/medBIT-r3-plus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-medium_mlm_tweet_en.md b/docs/_posts/ahmedlone127/2023-09-14-medium_mlm_tweet_en.md new file mode 100644 index 00000000000000..29d7db54fcfc96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-medium_mlm_tweet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English medium_mlm_tweet BertEmbeddings from muhtasham +author: John Snow Labs +name: medium_mlm_tweet +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medium_mlm_tweet` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medium_mlm_tweet_en_5.1.1_3.0_1694664476857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medium_mlm_tweet_en_5.1.1_3.0_1694664476857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("medium_mlm_tweet","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("medium_mlm_tweet", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medium_mlm_tweet| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|154.2 MB| + +## References + +https://huggingface.co/muhtasham/medium-mlm-tweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-medruberttiny2_ru.md b/docs/_posts/ahmedlone127/2023-09-14-medruberttiny2_ru.md new file mode 100644 index 00000000000000..5aefa2d60c1af0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-medruberttiny2_ru.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Russian medruberttiny2 BertEmbeddings from DmitryPogrebnoy +author: John Snow Labs +name: medruberttiny2 +date: 2023-09-14 +tags: [bert, ru, open_source, fill_mask, onnx] +task: Embeddings +language: ru +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medruberttiny2` is a Russian model originally trained by DmitryPogrebnoy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medruberttiny2_ru_5.1.1_3.0_1694658844568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medruberttiny2_ru_5.1.1_3.0_1694658844568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("medruberttiny2","ru") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("medruberttiny2", "ru") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medruberttiny2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ru| +|Size:|109.1 MB| + +## References + +https://huggingface.co/DmitryPogrebnoy/MedRuBertTiny2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mergedistill_base_cased_anneal_en.md b/docs/_posts/ahmedlone127/2023-09-14-mergedistill_base_cased_anneal_en.md new file mode 100644 index 00000000000000..e7cccffcb1d689 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mergedistill_base_cased_anneal_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mergedistill_base_cased_anneal BertEmbeddings from amitness +author: John Snow Labs +name: mergedistill_base_cased_anneal +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mergedistill_base_cased_anneal` is an English model originally trained by amitness. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mergedistill_base_cased_anneal_en_5.1.1_3.0_1694655583062.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mergedistill_base_cased_anneal_en_5.1.1_3.0_1694655583062.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mergedistill_base_cased_anneal","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mergedistill_base_cased_anneal", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mergedistill_base_cased_anneal| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|538.7 MB| + +## References + +https://huggingface.co/amitness/mergedistill-base-cased-anneal \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mergedistill_base_cased_anneal_v4_en.md b/docs/_posts/ahmedlone127/2023-09-14-mergedistill_base_cased_anneal_v4_en.md new file mode 100644 index 00000000000000..db87564f46d8d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mergedistill_base_cased_anneal_v4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mergedistill_base_cased_anneal_v4 BertEmbeddings from amitness +author: John Snow Labs +name: mergedistill_base_cased_anneal_v4 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mergedistill_base_cased_anneal_v4` is an English model originally trained by amitness. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mergedistill_base_cased_anneal_v4_en_5.1.1_3.0_1694658171035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mergedistill_base_cased_anneal_v4_en_5.1.1_3.0_1694658171035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mergedistill_base_cased_anneal_v4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mergedistill_base_cased_anneal_v4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mergedistill_base_cased_anneal_v4| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|539.3 MB| + +## References + +https://huggingface.co/amitness/mergedistill-base-cased-anneal-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mini_mlm_imdb_en.md b/docs/_posts/ahmedlone127/2023-09-14-mini_mlm_imdb_en.md new file mode 100644 index 00000000000000..08aa73b0b2ed25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mini_mlm_imdb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mini_mlm_imdb BertEmbeddings from muhtasham +author: John Snow Labs +name: mini_mlm_imdb +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mini_mlm_imdb` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mini_mlm_imdb_en_5.1.1_3.0_1694664315374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mini_mlm_imdb_en_5.1.1_3.0_1694664315374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mini_mlm_imdb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mini_mlm_imdb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mini_mlm_imdb| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.9 MB| + +## References + +https://huggingface.co/muhtasham/mini-mlm-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mini_mlm_tweet_en.md b/docs/_posts/ahmedlone127/2023-09-14-mini_mlm_tweet_en.md new file mode 100644 index 00000000000000..6d36dd181e8794 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mini_mlm_tweet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mini_mlm_tweet BertEmbeddings from muhtasham +author: John Snow Labs +name: mini_mlm_tweet +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mini_mlm_tweet` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mini_mlm_tweet_en_5.1.1_3.0_1694663987603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mini_mlm_tweet_en_5.1.1_3.0_1694663987603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mini_mlm_tweet","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mini_mlm_tweet", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mini_mlm_tweet| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.8 MB| + +## References + +https://huggingface.co/muhtasham/mini-mlm-tweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mlm_20230416_003_1_en.md b/docs/_posts/ahmedlone127/2023-09-14-mlm_20230416_003_1_en.md new file mode 100644 index 00000000000000..564cc07291fc77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mlm_20230416_003_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlm_20230416_003_1 BertEmbeddings from intanm +author: John Snow Labs +name: mlm_20230416_003_1 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlm_20230416_003_1` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlm_20230416_003_1_en_5.1.1_3.0_1694657044354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlm_20230416_003_1_en_5.1.1_3.0_1694657044354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlm_20230416_003_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlm_20230416_003_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlm_20230416_003_1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/intanm/mlm-20230416-003-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mlm_20230416_003_2_en.md b/docs/_posts/ahmedlone127/2023-09-14-mlm_20230416_003_2_en.md new file mode 100644 index 00000000000000..32daa3fdf5ecd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mlm_20230416_003_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlm_20230416_003_2 BertEmbeddings from intanm +author: John Snow Labs +name: mlm_20230416_003_2 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlm_20230416_003_2` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlm_20230416_003_2_en_5.1.1_3.0_1694658343177.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlm_20230416_003_2_en_5.1.1_3.0_1694658343177.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlm_20230416_003_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlm_20230416_003_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlm_20230416_003_2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|464.2 MB| + +## References + +https://huggingface.co/intanm/mlm-20230416-003-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mlperf_inference_bert_pytorch_fp32_squad_v1.1_en.md b/docs/_posts/ahmedlone127/2023-09-14-mlperf_inference_bert_pytorch_fp32_squad_v1.1_en.md new file mode 100644 index 00000000000000..5220b152bfa159 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mlperf_inference_bert_pytorch_fp32_squad_v1.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlperf_inference_bert_pytorch_fp32_squad_v1.1 BertEmbeddings from cknowledge +author: John Snow Labs +name: mlperf_inference_bert_pytorch_fp32_squad_v1.1 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlperf_inference_bert_pytorch_fp32_squad_v1.1` is an English model originally trained by cknowledge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlperf_inference_bert_pytorch_fp32_squad_v1.1_en_5.1.1_3.0_1694660332701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlperf_inference_bert_pytorch_fp32_squad_v1.1_en_5.1.1_3.0_1694660332701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mlperf_inference_bert_pytorch_fp32_squad_v1.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mlperf_inference_bert_pytorch_fp32_squad_v1.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlperf_inference_bert_pytorch_fp32_squad_v1.1| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/cknowledge/mlperf-inference-bert-pytorch-fp32-squad-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-model_imdb_finetuned_en.md b/docs/_posts/ahmedlone127/2023-09-14-model_imdb_finetuned_en.md new file mode 100644 index 00000000000000..fb86215e144fba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-model_imdb_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model_imdb_finetuned BertEmbeddings from phanidhar +author: John Snow Labs +name: model_imdb_finetuned +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_imdb_finetuned` is an English model originally trained by phanidhar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_imdb_finetuned_en_5.1.1_3.0_1694662202940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_imdb_finetuned_en_5.1.1_3.0_1694662202940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("model_imdb_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("model_imdb_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_imdb_finetuned| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/phanidhar/model-imdb-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-model_saeid7776_en.md b/docs/_posts/ahmedlone127/2023-09-14-model_saeid7776_en.md new file mode 100644 index 00000000000000..686979f1e283ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-model_saeid7776_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model_saeid7776 BertEmbeddings from saeid7776 +author: John Snow Labs +name: model_saeid7776 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_saeid7776` is an English model originally trained by saeid7776. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_saeid7776_en_5.1.1_3.0_1694665486872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_saeid7776_en_5.1.1_3.0_1694665486872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("model_saeid7776","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("model_saeid7776", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_saeid7776| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|672.3 MB| + +## References + +https://huggingface.co/saeid7776/model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-model_v02_en.md b/docs/_posts/ahmedlone127/2023-09-14-model_v02_en.md new file mode 100644 index 00000000000000..c54d628bbc45d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-model_v02_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model_v02 BertEmbeddings from saeid7776 +author: John Snow Labs +name: model_v02 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_v02` is an English model originally trained by saeid7776. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_v02_en_5.1.1_3.0_1694665653448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_v02_en_5.1.1_3.0_1694665653448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("model_v02","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("model_v02", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_v02| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|672.8 MB| + +## References + +https://huggingface.co/saeid7776/model_v02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-muril_base_cased_en.md b/docs/_posts/ahmedlone127/2023-09-14-muril_base_cased_en.md new file mode 100644 index 00000000000000..4a9eb0904bf4fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-muril_base_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English muril_base_cased BertEmbeddings from google +author: John Snow Labs +name: muril_base_cased +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `muril_base_cased` is an English model originally trained by google. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/muril_base_cased_en_5.1.1_3.0_1694651581808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/muril_base_cased_en_5.1.1_3.0_1694651581808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("muril_base_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("muril_base_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|muril_base_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|890.4 MB| + +## References + +https://huggingface.co/google/muril-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mvr_squad_bert_base_multilingual_cased_xx.md b/docs/_posts/ahmedlone127/2023-09-14-mvr_squad_bert_base_multilingual_cased_xx.md new file mode 100644 index 00000000000000..6f71bf2d7109eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mvr_squad_bert_base_multilingual_cased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual mvr_squad_bert_base_multilingual_cased BertEmbeddings from dyyyyyyyy +author: John Snow Labs +name: mvr_squad_bert_base_multilingual_cased +date: 2023-09-14 +tags: [bert, xx, open_source, fill_mask, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mvr_squad_bert_base_multilingual_cased` is a Multilingual model originally trained by dyyyyyyyy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mvr_squad_bert_base_multilingual_cased_xx_5.1.1_3.0_1694658258601.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mvr_squad_bert_base_multilingual_cased_xx_5.1.1_3.0_1694658258601.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mvr_squad_bert_base_multilingual_cased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mvr_squad_bert_base_multilingual_cased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mvr_squad_bert_base_multilingual_cased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| + +## References + +https://huggingface.co/dyyyyyyyy/MVR_squad_BERT-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-mymodel_en.md b/docs/_posts/ahmedlone127/2023-09-14-mymodel_en.md new file mode 100644 index 00000000000000..d47053633850ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-mymodel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mymodel BertEmbeddings from heima +author: John Snow Labs +name: mymodel +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mymodel` is an English model originally trained by heima. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mymodel_en_5.1.1_3.0_1694655891652.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mymodel_en_5.1.1_3.0_1694655891652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mymodel","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mymodel", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mymodel| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/heima/mymodel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-oyo_bert_base_yo.md b/docs/_posts/ahmedlone127/2023-09-14-oyo_bert_base_yo.md new file mode 100644 index 00000000000000..a318f643294551 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-oyo_bert_base_yo.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Yoruba oyo_bert_base BertEmbeddings from Davlan +author: John Snow Labs +name: oyo_bert_base +date: 2023-09-14 +tags: [bert, yo, open_source, fill_mask, onnx] +task: Embeddings +language: yo +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `oyo_bert_base` is a Yoruba model originally trained by Davlan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/oyo_bert_base_yo_5.1.1_3.0_1694663927219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/oyo_bert_base_yo_5.1.1_3.0_1694663927219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("oyo_bert_base","yo") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("oyo_bert_base", "yo") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|oyo_bert_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|yo| +|Size:|412.5 MB| + +## References + +https://huggingface.co/Davlan/oyo-bert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-parlbert_german_law_de.md b/docs/_posts/ahmedlone127/2023-09-14-parlbert_german_law_de.md new file mode 100644 index 00000000000000..f50623989eecdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-parlbert_german_law_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German parlbert_german_law BertEmbeddings from InfAI +author: John Snow Labs +name: parlbert_german_law +date: 2023-09-14 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parlbert_german_law` is a German model originally trained by InfAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parlbert_german_law_de_5.1.1_3.0_1694667648647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parlbert_german_law_de_5.1.1_3.0_1694667648647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("parlbert_german_law","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("parlbert_german_law", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parlbert_german_law| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|406.8 MB| + +## References + +https://huggingface.co/InfAI/parlbert-german-law \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-project3_model_en.md b/docs/_posts/ahmedlone127/2023-09-14-project3_model_en.md new file mode 100644 index 00000000000000..3b96943457b19b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-project3_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English project3_model BertEmbeddings from nithya +author: John Snow Labs +name: project3_model +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `project3_model` is an English model originally trained by nithya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/project3_model_en_5.1.1_3.0_1694662572357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/project3_model_en_5.1.1_3.0_1694662572357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("project3_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("project3_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|project3_model| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/nithya/project3-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-public_models_en.md b/docs/_posts/ahmedlone127/2023-09-14-public_models_en.md new file mode 100644 index 00000000000000..0c01d10eb05cd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-public_models_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English public_models BertEmbeddings from helloNet +author: John Snow Labs +name: public_models +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `public_models` is an English model originally trained by helloNet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/public_models_en_5.1.1_3.0_1694656362220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/public_models_en_5.1.1_3.0_1694656362220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("public_models","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("public_models", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|public_models| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/helloNet/public_models \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-radbert_en.md b/docs/_posts/ahmedlone127/2023-09-14-radbert_en.md new file mode 100644 index 00000000000000..48d8e85eaec6d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-radbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English radbert BertEmbeddings from StanfordAIMI +author: John Snow Labs +name: radbert +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `radbert` is an English model originally trained by StanfordAIMI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/radbert_en_5.1.1_3.0_1694656544468.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/radbert_en_5.1.1_3.0_1694656544468.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("radbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("radbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|radbert| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.5 MB| + +## References + +https://huggingface.co/StanfordAIMI/RadBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-retromae_msmarco_distill_en.md b/docs/_posts/ahmedlone127/2023-09-14-retromae_msmarco_distill_en.md new file mode 100644 index 00000000000000..c6bb0e888a5a06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-retromae_msmarco_distill_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English retromae_msmarco_distill BertEmbeddings from Shitao +author: John Snow Labs +name: retromae_msmarco_distill +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `retromae_msmarco_distill` is an English model originally trained by Shitao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/retromae_msmarco_distill_en_5.1.1_3.0_1694650755532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/retromae_msmarco_distill_en_5.1.1_3.0_1694650755532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("retromae_msmarco_distill","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("retromae_msmarco_distill", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|retromae_msmarco_distill| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.7 MB| + +## References + +https://huggingface.co/Shitao/RetroMAE_MSMARCO_distill \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-sagorbert_nwp_finetuning_test2_en.md b/docs/_posts/ahmedlone127/2023-09-14-sagorbert_nwp_finetuning_test2_en.md new file mode 100644 index 00000000000000..8445a53a13997a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-sagorbert_nwp_finetuning_test2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sagorbert_nwp_finetuning_test2 BertEmbeddings from amirhamza11 +author: John Snow Labs +name: sagorbert_nwp_finetuning_test2 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagorbert_nwp_finetuning_test2` is an English model originally trained by amirhamza11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagorbert_nwp_finetuning_test2_en_5.1.1_3.0_1694659149656.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagorbert_nwp_finetuning_test2_en_5.1.1_3.0_1694659149656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("sagorbert_nwp_finetuning_test2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("sagorbert_nwp_finetuning_test2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagorbert_nwp_finetuning_test2| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|614.8 MB| + +## References + +https://huggingface.co/amirhamza11/sagorbert_nwp_finetuning_test2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-sagorbert_nwp_finetuning_test4_en.md b/docs/_posts/ahmedlone127/2023-09-14-sagorbert_nwp_finetuning_test4_en.md new file mode 100644 index 00000000000000..1a7e72dbe5b021 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-sagorbert_nwp_finetuning_test4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sagorbert_nwp_finetuning_test4 BertEmbeddings from amirhamza11 +author: John Snow Labs +name: sagorbert_nwp_finetuning_test4 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagorbert_nwp_finetuning_test4` is an English model originally trained by amirhamza11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagorbert_nwp_finetuning_test4_en_5.1.1_3.0_1694662734694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagorbert_nwp_finetuning_test4_en_5.1.1_3.0_1694662734694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("sagorbert_nwp_finetuning_test4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("sagorbert_nwp_finetuning_test4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagorbert_nwp_finetuning_test4| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|614.9 MB| + +## References + +https://huggingface.co/amirhamza11/sagorbert_nwp_finetuning_test4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-scholarbert_100_64bit_en.md b/docs/_posts/ahmedlone127/2023-09-14-scholarbert_100_64bit_en.md new file mode 100644 index 00000000000000..f8902d9a65e07a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-scholarbert_100_64bit_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scholarbert_100_64bit BertEmbeddings from globuslabs +author: John Snow Labs +name: scholarbert_100_64bit +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scholarbert_100_64bit` is an English model originally trained by globuslabs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scholarbert_100_64bit_en_5.1.1_3.0_1694668681788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scholarbert_100_64bit_en_5.1.1_3.0_1694668681788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("scholarbert_100_64bit","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("scholarbert_100_64bit", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scholarbert_100_64bit| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|842.4 MB| + +## References + +https://huggingface.co/globuslabs/ScholarBERT_100_64bit \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-skc_mlm_german_torch_de.md b/docs/_posts/ahmedlone127/2023-09-14-skc_mlm_german_torch_de.md new file mode 100644 index 00000000000000..ffb89c33cc68f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-skc_mlm_german_torch_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German skc_mlm_german_torch BertEmbeddings from Tobias +author: John Snow Labs +name: skc_mlm_german_torch +date: 2023-09-14 +tags: [bert, de, open_source, fill_mask, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `skc_mlm_german_torch` is a German model originally trained by Tobias. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/skc_mlm_german_torch_de_5.1.1_3.0_1694663119516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/skc_mlm_german_torch_de_5.1.1_3.0_1694663119516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("skc_mlm_german_torch","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("skc_mlm_german_torch", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|skc_mlm_german_torch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|406.9 MB| + +## References + +https://huggingface.co/Tobias/skc_MLM_German_torch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-small_mlm_imdb_en.md b/docs/_posts/ahmedlone127/2023-09-14-small_mlm_imdb_en.md new file mode 100644 index 00000000000000..08a988389d73b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-small_mlm_imdb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English small_mlm_imdb BertEmbeddings from muhtasham +author: John Snow Labs +name: small_mlm_imdb +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `small_mlm_imdb` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/small_mlm_imdb_en_5.1.1_3.0_1694664931729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/small_mlm_imdb_en_5.1.1_3.0_1694664931729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("small_mlm_imdb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("small_mlm_imdb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|small_mlm_imdb| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|107.0 MB| + +## References + +https://huggingface.co/muhtasham/small-mlm-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-small_mlm_tweet_en.md b/docs/_posts/ahmedlone127/2023-09-14-small_mlm_tweet_en.md new file mode 100644 index 00000000000000..e354adee2a69c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-small_mlm_tweet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English small_mlm_tweet BertEmbeddings from muhtasham +author: John Snow Labs +name: small_mlm_tweet +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `small_mlm_tweet` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/small_mlm_tweet_en_5.1.1_3.0_1694664219703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/small_mlm_tweet_en_5.1.1_3.0_1694664219703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("small_mlm_tweet","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("small_mlm_tweet", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|small_mlm_tweet| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|106.9 MB| + +## References + +https://huggingface.co/muhtasham/small-mlm-tweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-splade_cocondenser_ensembledistil_en.md b/docs/_posts/ahmedlone127/2023-09-14-splade_cocondenser_ensembledistil_en.md new file mode 100644 index 00000000000000..0f067aeb8edb23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-splade_cocondenser_ensembledistil_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_cocondenser_ensembledistil BertEmbeddings from naver +author: John Snow Labs +name: splade_cocondenser_ensembledistil +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_cocondenser_ensembledistil` is an English model originally trained by naver. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_cocondenser_ensembledistil_en_5.1.1_3.0_1694661773914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_cocondenser_ensembledistil_en_5.1.1_3.0_1694661773914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_cocondenser_ensembledistil","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_cocondenser_ensembledistil", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_cocondenser_ensembledistil| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/naver/splade-cocondenser-ensembledistil \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-splade_cocondenser_selfdistil_naver_en.md b/docs/_posts/ahmedlone127/2023-09-14-splade_cocondenser_selfdistil_naver_en.md new file mode 100644 index 00000000000000..837a77557b8ae5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-splade_cocondenser_selfdistil_naver_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English splade_cocondenser_selfdistil_naver BertEmbeddings from naver +author: John Snow Labs +name: splade_cocondenser_selfdistil_naver +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `splade_cocondenser_selfdistil_naver` is an English model originally trained by naver. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/splade_cocondenser_selfdistil_naver_en_5.1.1_3.0_1694661508394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/splade_cocondenser_selfdistil_naver_en_5.1.1_3.0_1694661508394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("splade_cocondenser_selfdistil_naver","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("splade_cocondenser_selfdistil_naver", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|splade_cocondenser_selfdistil_naver| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/naver/splade-cocondenser-selfdistil \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-telugu_bert_scratch_te.md b/docs/_posts/ahmedlone127/2023-09-14-telugu_bert_scratch_te.md new file mode 100644 index 00000000000000..4aa47d4b618289 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-telugu_bert_scratch_te.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Telugu telugu_bert_scratch BertEmbeddings from l3cube-pune +author: John Snow Labs +name: telugu_bert_scratch +date: 2023-09-14 +tags: [bert, te, open_source, fill_mask, onnx] +task: Embeddings +language: te +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `telugu_bert_scratch` is a Telugu model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/telugu_bert_scratch_te_5.1.1_3.0_1694651570626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/telugu_bert_scratch_te_5.1.1_3.0_1694651570626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("telugu_bert_scratch","te") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("telugu_bert_scratch", "te") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|telugu_bert_scratch| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|te| +|Size:|470.5 MB| + +## References + +https://huggingface.co/l3cube-pune/telugu-bert-scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-test_bert_base_spanish_wwm_cased_finetuned_ultrasounds_en.md b/docs/_posts/ahmedlone127/2023-09-14-test_bert_base_spanish_wwm_cased_finetuned_ultrasounds_en.md new file mode 100644 index 00000000000000..a8a9587cb1dd00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-test_bert_base_spanish_wwm_cased_finetuned_ultrasounds_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_bert_base_spanish_wwm_cased_finetuned_ultrasounds BertEmbeddings from manucos +author: John Snow Labs +name: test_bert_base_spanish_wwm_cased_finetuned_ultrasounds +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_bert_base_spanish_wwm_cased_finetuned_ultrasounds` is an English model originally trained by manucos.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_bert_base_spanish_wwm_cased_finetuned_ultrasounds_en_5.1.1_3.0_1694663393018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_bert_base_spanish_wwm_cased_finetuned_ultrasounds_en_5.1.1_3.0_1694663393018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("test_bert_base_spanish_wwm_cased_finetuned_ultrasounds","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("test_bert_base_spanish_wwm_cased_finetuned_ultrasounds", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_bert_base_spanish_wwm_cased_finetuned_ultrasounds| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/manucos/test-bert-base-spanish-wwm-cased-finetuned-ultrasounds \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-test_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-09-14-test_bert_base_uncased_en.md new file mode 100644 index 00000000000000..ca35916c7f1021 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-test_bert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_bert_base_uncased BertEmbeddings from kkkzzzkkk +author: John Snow Labs +name: test_bert_base_uncased +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_bert_base_uncased` is an English model originally trained by kkkzzzkkk. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_bert_base_uncased_en_5.1.1_3.0_1694663392938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_bert_base_uncased_en_5.1.1_3.0_1694663392938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("test_bert_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("test_bert_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_bert_base_uncased| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/kkkzzzkkk/test_bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-test_dushen_en.md b/docs/_posts/ahmedlone127/2023-09-14-test_dushen_en.md new file mode 100644 index 00000000000000..80ab3cd98bf81c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-test_dushen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_dushen BertEmbeddings from dushen +author: John Snow Labs +name: test_dushen +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_dushen` is an English model originally trained by dushen. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_dushen_en_5.1.1_3.0_1694653639422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_dushen_en_5.1.1_3.0_1694653639422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("test_dushen","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("test_dushen", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_dushen| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/dushen/test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-tiny_mlm_imdb_en.md b/docs/_posts/ahmedlone127/2023-09-14-tiny_mlm_imdb_en.md new file mode 100644 index 00000000000000..a95b493f28606b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-tiny_mlm_imdb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tiny_mlm_imdb BertEmbeddings from muhtasham +author: John Snow Labs +name: tiny_mlm_imdb +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_mlm_imdb` is an English model originally trained by muhtasham. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_mlm_imdb_en_5.1.1_3.0_1694663665545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_mlm_imdb_en_5.1.1_3.0_1694663665545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tiny_mlm_imdb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tiny_mlm_imdb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_mlm_imdb| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/muhtasham/tiny-mlm-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-tiny_mlm_tweet_en.md b/docs/_posts/ahmedlone127/2023-09-14-tiny_mlm_tweet_en.md new file mode 100644 index 00000000000000..7ab71d2936ee20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-tiny_mlm_tweet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tiny_mlm_tweet BertEmbeddings from muhtasham +author: John Snow Labs +name: tiny_mlm_tweet +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_mlm_tweet` is an English model originally trained by muhtasham. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_mlm_tweet_en_5.1.1_3.0_1694663791725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_mlm_tweet_en_5.1.1_3.0_1694663791725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("tiny_mlm_tweet","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("tiny_mlm_tweet", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_mlm_tweet| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/muhtasham/tiny-mlm-tweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-ucb_bert_finetunned_en.md b/docs/_posts/ahmedlone127/2023-09-14-ucb_bert_finetunned_en.md new file mode 100644 index 00000000000000..2ab6c71bbc765b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-ucb_bert_finetunned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ucb_bert_finetunned BertEmbeddings from Diegomejia +author: John Snow Labs +name: ucb_bert_finetunned +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ucb_bert_finetunned` is an English model originally trained by Diegomejia. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ucb_bert_finetunned_en_5.1.1_3.0_1694661140033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ucb_bert_finetunned_en_5.1.1_3.0_1694661140033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("ucb_bert_finetunned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("ucb_bert_finetunned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ucb_bert_finetunned| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Diegomejia/ucb-bert-finetunned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-vatestnew_en.md b/docs/_posts/ahmedlone127/2023-09-14-vatestnew_en.md new file mode 100644 index 00000000000000..6eaaf25ddb1f10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-vatestnew_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English vatestnew BertEmbeddings from mtluczek80 +author: John Snow Labs +name: vatestnew +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vatestnew` is an English model originally trained by mtluczek80. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vatestnew_en_5.1.1_3.0_1694657391978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vatestnew_en_5.1.1_3.0_1694657391978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("vatestnew","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("vatestnew", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vatestnew| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/mtluczek80/VATestNew \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-vbert_2021_base_en.md b/docs/_posts/ahmedlone127/2023-09-14-vbert_2021_base_en.md new file mode 100644 index 00000000000000..9f8d9584a3a562 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-vbert_2021_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English vbert_2021_base BertEmbeddings from VMware +author: John Snow Labs +name: vbert_2021_base +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vbert_2021_base` is an English model originally trained by VMware. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vbert_2021_base_en_5.1.1_3.0_1694664971607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vbert_2021_base_en_5.1.1_3.0_1694664971607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("vbert_2021_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("vbert_2021_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vbert_2021_base| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/VMware/vbert-2021-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-weights_bert_mlm_epoch50_en.md b/docs/_posts/ahmedlone127/2023-09-14-weights_bert_mlm_epoch50_en.md new file mode 100644 index 00000000000000..d3405b6f4c8b2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-weights_bert_mlm_epoch50_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English weights_bert_mlm_epoch50 BertEmbeddings from grumpy +author: John Snow Labs +name: weights_bert_mlm_epoch50 +date: 2023-09-14 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `weights_bert_mlm_epoch50` is an English model originally trained by grumpy. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/weights_bert_mlm_epoch50_en_5.1.1_3.0_1694652096710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/weights_bert_mlm_epoch50_en_5.1.1_3.0_1694652096710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("weights_bert_mlm_epoch50","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("weights_bert_mlm_epoch50", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|weights_bert_mlm_epoch50| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/grumpy/weights_bert_mlm_epoch50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-14-word_ethical_ko.md b/docs/_posts/ahmedlone127/2023-09-14-word_ethical_ko.md new file mode 100644 index 00000000000000..04b4d7f98c066d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-14-word_ethical_ko.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Korean word_ethical BertEmbeddings from julian5383 +author: John Snow Labs +name: word_ethical +date: 2023-09-14 +tags: [bert, ko, open_source, fill_mask, onnx] +task: Embeddings +language: ko +edition: Spark NLP 5.1.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `word_ethical` is a Korean model originally trained by julian5383. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/word_ethical_ko_5.1.1_3.0_1694670073884.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/word_ethical_ko_5.1.1_3.0_1694670073884.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("word_ethical","ko") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("word_ethical", "ko") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|word_ethical| +|Compatibility:|Spark NLP 5.1.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|ko| +|Size:|421.2 MB| + +## References + +https://huggingface.co/julian5383/word_ethical \ No newline at end of file