diff --git a/docs/_posts/ahmedlone127/2024-09-07-somd_xlm_stage1_v2_en.md b/docs/_posts/ahmedlone127/2024-09-07-somd_xlm_stage1_v2_en.md new file mode 100644 index 00000000000000..ea03555101ad17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-07-somd_xlm_stage1_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English somd_xlm_stage1_v2 XlmRoBertaForTokenClassification from ThuyNT03 +author: John Snow Labs +name: somd_xlm_stage1_v2 +date: 2024-09-07 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `somd_xlm_stage1_v2` is an English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/somd_xlm_stage1_v2_en_5.5.0_3.0_1725687603645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/somd_xlm_stage1_v2_en_5.5.0_3.0_1725687603645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = XlmRoBertaForTokenClassification.pretrained("somd_xlm_stage1_v2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("somd_xlm_stage1_v2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|somd_xlm_stage1_v2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|797.6 MB| + +## References + +https://huggingface.co/ThuyNT03/SOMD-xlm-stage1-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-09-cot_ep3_42_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-09-cot_ep3_42_pipeline_en.md new file mode 100644 index 00000000000000..bed5a826ef30ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-09-cot_ep3_42_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English cot_ep3_42_pipeline pipeline MPNetEmbeddings from ingeol +author: John Snow Labs +name: cot_ep3_42_pipeline +date: 2024-09-09 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cot_ep3_42_pipeline` is an English model originally trained by ingeol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cot_ep3_42_pipeline_en_5.5.0_3.0_1725897373617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cot_ep3_42_pipeline_en_5.5.0_3.0_1725897373617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cot_ep3_42_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cot_ep3_42_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cot_ep3_42_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ingeol/cot_ep3_42 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-11-action_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-11-action_pipeline_en.md new file mode 100644 index 00000000000000..72a3fe7b854680 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-11-action_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English action_pipeline pipeline DistilBertForSequenceClassification from SergeyTW +author: John Snow Labs +name: action_pipeline +date: 2024-09-11 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `action_pipeline` is an English model originally trained by SergeyTW. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/action_pipeline_en_5.5.0_3.0_1726014444476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/action_pipeline_en_5.5.0_3.0_1726014444476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("action_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("action_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|action_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SergeyTW/action + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-11-finetuned_mixed_2epochs_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-11-finetuned_mixed_2epochs_pipeline_en.md new file mode 100644 index 00000000000000..175783c6cff77c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-11-finetuned_mixed_2epochs_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English finetuned_mixed_2epochs_pipeline pipeline MPNetEmbeddings from jhsmith +author: John Snow Labs +name: finetuned_mixed_2epochs_pipeline +date: 2024-09-11 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_mixed_2epochs_pipeline` is an English model originally trained by jhsmith. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_mixed_2epochs_pipeline_en_5.5.0_3.0_1726054541841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_mixed_2epochs_pipeline_en_5.5.0_3.0_1726054541841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_mixed_2epochs_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_mixed_2epochs_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_mixed_2epochs_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/jhsmith/finetuned_mixed_2epochs + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-11-uned_tfg_08_77_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-11-uned_tfg_08_77_pipeline_en.md new file mode 100644 index 00000000000000..9f9400dd5c2eb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-11-uned_tfg_08_77_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English uned_tfg_08_77_pipeline pipeline RoBertaForSequenceClassification from alexisdr +author: John Snow Labs +name: uned_tfg_08_77_pipeline +date: 2024-09-11 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uned_tfg_08_77_pipeline` is an English model originally trained by alexisdr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/uned_tfg_08_77_pipeline_en_5.5.0_3.0_1726090867835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/uned_tfg_08_77_pipeline_en_5.5.0_3.0_1726090867835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("uned_tfg_08_77_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("uned_tfg_08_77_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|uned_tfg_08_77_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|443.8 MB| + +## References + +https://huggingface.co/alexisdr/uned-tfg-08.77 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-12-akan_3_3000ms_pipeline_ak.md b/docs/_posts/ahmedlone127/2024-09-12-akan_3_3000ms_pipeline_ak.md new file mode 100644 index 00000000000000..0e5a2d0a865b57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-12-akan_3_3000ms_pipeline_ak.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Akan akan_3_3000ms_pipeline pipeline WhisperForCTC from devkyle +author: John Snow Labs +name: akan_3_3000ms_pipeline +date: 2024-09-12 +tags: [ak, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: ak +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `akan_3_3000ms_pipeline` is an Akan model originally trained by devkyle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/akan_3_3000ms_pipeline_ak_5.5.0_3.0_1726151442866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/akan_3_3000ms_pipeline_ak_5.5.0_3.0_1726151442866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("akan_3_3000ms_pipeline", lang = "ak") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("akan_3_3000ms_pipeline", lang = "ak") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|akan_3_3000ms_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ak| +|Size:|389.6 MB| + +## References + +https://huggingface.co/devkyle/Akan-3-3000ms + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-12-xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-12-xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline_en.md new file mode 100644 index 00000000000000..a3f9b59adba16c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-12-xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline pipeline XlmRoBertaForTokenClassification from jnrahul92 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline +date: 2024-09-12 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline` is an English model originally trained by jnrahul92.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline_en_5.5.0_3.0_1726115403155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline_en_5.5.0_3.0_1726115403155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_jnrahul92_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/jnrahul92/xlm-roberta-base-finetuned_panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-14-arwiki_20230101_roberta_mlm_bots_ar.md b/docs/_posts/ahmedlone127/2024-09-14-arwiki_20230101_roberta_mlm_bots_ar.md new file mode 100644 index 00000000000000..3531139f63f8e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-14-arwiki_20230101_roberta_mlm_bots_ar.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Arabic arwiki_20230101_roberta_mlm_bots RoBertaEmbeddings from SaiedAlshahrani +author: John Snow Labs +name: arwiki_20230101_roberta_mlm_bots +date: 2024-09-14 +tags: [ar, open_source, onnx, embeddings, roberta] +task: Embeddings +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arwiki_20230101_roberta_mlm_bots` is an Arabic model originally trained by SaiedAlshahrani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arwiki_20230101_roberta_mlm_bots_ar_5.5.0_3.0_1726338709197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arwiki_20230101_roberta_mlm_bots_ar_5.5.0_3.0_1726338709197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = RoBertaEmbeddings.pretrained("arwiki_20230101_roberta_mlm_bots","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = RoBertaEmbeddings.pretrained("arwiki_20230101_roberta_mlm_bots","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arwiki_20230101_roberta_mlm_bots| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[roberta]| +|Language:|ar| +|Size:|311.7 MB| + +## References + +https://huggingface.co/SaiedAlshahrani/arwiki_20230101_roberta_mlm_bots \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-15-prompt_injection_bert_en.md b/docs/_posts/ahmedlone127/2024-09-15-prompt_injection_bert_en.md new file mode 100644 index 00000000000000..31911ea623eef6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-15-prompt_injection_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English prompt_injection_bert DistilBertForSequenceClassification from yashika-03 +author: John Snow Labs +name: prompt_injection_bert +date: 2024-09-15 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `prompt_injection_bert` is an English model originally trained by yashika-03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/prompt_injection_bert_en_5.5.0_3.0_1726365714562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/prompt_injection_bert_en_5.5.0_3.0_1726365714562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("prompt_injection_bert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("prompt_injection_bert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|prompt_injection_bert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/yashika-03/prompt-injection-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-15-test_model_jockerli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-15-test_model_jockerli_pipeline_en.md new file mode 100644 index 00000000000000..c27fa99ba730b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-15-test_model_jockerli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English test_model_jockerli_pipeline pipeline DistilBertForSequenceClassification from JockerLi +author: John Snow Labs +name: test_model_jockerli_pipeline +date: 2024-09-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_model_jockerli_pipeline` is an English model originally trained by JockerLi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_model_jockerli_pipeline_en_5.5.0_3.0_1726365842783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_model_jockerli_pipeline_en_5.5.0_3.0_1726365842783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("test_model_jockerli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("test_model_jockerli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_model_jockerli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JockerLi/test_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-15-whisper_small_hindi_kshitizkhandelwal_tr.md b/docs/_posts/ahmedlone127/2024-09-15-whisper_small_hindi_kshitizkhandelwal_tr.md new file mode 100644 index 00000000000000..479c03d6f29d94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-15-whisper_small_hindi_kshitizkhandelwal_tr.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Turkish whisper_small_hindi_kshitizkhandelwal WhisperForCTC from Kshitizkhandelwal +author: John Snow Labs +name: whisper_small_hindi_kshitizkhandelwal +date: 2024-09-15 +tags: [tr, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_small_hindi_kshitizkhandelwal` is a Turkish model originally trained by Kshitizkhandelwal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_hindi_kshitizkhandelwal_tr_5.5.0_3.0_1726430520772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_hindi_kshitizkhandelwal_tr_5.5.0_3.0_1726430520772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("whisper_small_hindi_kshitizkhandelwal","tr") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("whisper_small_hindi_kshitizkhandelwal", "tr") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_hindi_kshitizkhandelwal| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|tr| +|Size:|1.7 GB| + +## References + +https://huggingface.co/Kshitizkhandelwal/whisper-small-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-16-distilbert_base_uncased_finetuned_imdb_jjin6668_en.md b/docs/_posts/ahmedlone127/2024-09-16-distilbert_base_uncased_finetuned_imdb_jjin6668_en.md new file mode 100644 index 00000000000000..2ccb23579d110c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-16-distilbert_base_uncased_finetuned_imdb_jjin6668_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_jjin6668 DistilBertEmbeddings from jjin6668 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_jjin6668 +date: 2024-09-16 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_imdb_jjin6668` is an English model originally trained by jjin6668.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jjin6668_en_5.5.0_3.0_1726472994936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jjin6668_en_5.5.0_3.0_1726472994936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_jjin6668","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_jjin6668","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_jjin6668| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jjin6668/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-16-distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-16-distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline_en.md new file mode 100644 index 00000000000000..65b166111363c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-16-distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline pipeline DistilBertForSequenceClassification from tom192180 +author: John Snow Labs +name: distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline +date: 2024-09-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline` is an English model originally trained by tom192180.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline_en_5.5.0_3.0_1726506626056.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline_en_5.5.0_3.0_1726506626056.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_utility_zphr_0st_ut12ut1_plain_simsp_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tom192180/distilbert-base-uncased_utility_zphr_0st_ut12ut1_plain_simsp + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-16-finetuning_sentiment_model_3000_samples_sarathaer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-16-finetuning_sentiment_model_3000_samples_sarathaer_pipeline_en.md new file mode 100644 index 00000000000000..bcdd4a0ac837f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-16-finetuning_sentiment_model_3000_samples_sarathaer_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_sarathaer_pipeline pipeline DistilBertForSequenceClassification from Sarathaer +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_sarathaer_pipeline +date: 2024-09-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuning_sentiment_model_3000_samples_sarathaer_pipeline` is an English model originally trained by Sarathaer. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sarathaer_pipeline_en_5.5.0_3.0_1726525637389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sarathaer_pipeline_en_5.5.0_3.0_1726525637389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_sentiment_model_3000_samples_sarathaer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_sentiment_model_3000_samples_sarathaer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_sarathaer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Sarathaer/finetuning-sentiment-model-3000-samples + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-16-opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka_en.md b/docs/_posts/ahmedlone127/2024-09-16-opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka_en.md new file mode 100644 index 00000000000000..42826c30ab4cc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-16-opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka MarianTransformer from Achuka +author: John Snow Labs +name: opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka +date: 2024-09-16 +tags: [en, open_source, onnx, translation, marian] +task: Translation +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MarianTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MarianTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka` is an English model originally trained by Achuka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka_en_5.5.0_3.0_1726457640414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka_en_5.5.0_3.0_1726457640414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +marian = MarianTransformer.pretrained("opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("translation") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, marian]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val marian = MarianTransformer.pretrained("opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka","en") + .setInputCols(Array("sentence")) + .setOutputCol("translation") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, marian)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|opus_maltese_english_ganda_finetuned_english_tonga_tonga_islands_ganda_achuka| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentences]| +|Output Labels:|[translation]| +|Language:|en| +|Size:|513.2 MB| + +## References + +https://huggingface.co/Achuka/opus-mt-en-lg-finetuned-en-to-lg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-16-whisper_tiny_smarthome_thai_pipeline_th.md b/docs/_posts/ahmedlone127/2024-09-16-whisper_tiny_smarthome_thai_pipeline_th.md new file mode 100644 index 00000000000000..a1e0cad8b85cbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-16-whisper_tiny_smarthome_thai_pipeline_th.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Thai whisper_tiny_smarthome_thai_pipeline pipeline WhisperForCTC from Porameht +author: John Snow Labs +name: whisper_tiny_smarthome_thai_pipeline +date: 2024-09-16 +tags: [th, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: th +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`whisper_tiny_smarthome_thai_pipeline` is a Thai model originally trained by Porameht. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_tiny_smarthome_thai_pipeline_th_5.5.0_3.0_1726480512876.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_tiny_smarthome_thai_pipeline_th_5.5.0_3.0_1726480512876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_tiny_smarthome_thai_pipeline", lang = "th") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_tiny_smarthome_thai_pipeline", lang = "th") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_tiny_smarthome_thai_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|th| +|Size:|389.8 MB| + +## References + +https://huggingface.co/Porameht/whisper-tiny-smarthome-thai + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-16-xlm_roberta_base_finetuned_panx_german_ryatora_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-16-xlm_roberta_base_finetuned_panx_german_ryatora_pipeline_en.md new file mode 100644 index 00000000000000..bd525451403cb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-16-xlm_roberta_base_finetuned_panx_german_ryatora_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_ryatora_pipeline pipeline XlmRoBertaForTokenClassification from ryatora +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_ryatora_pipeline +date: 2024-09-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_ryatora_pipeline` is an English model originally trained by ryatora. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_ryatora_pipeline_en_5.5.0_3.0_1726497048057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_ryatora_pipeline_en_5.5.0_3.0_1726497048057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_ryatora_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_ryatora_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_ryatora_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|853.8 MB| + +## References + +https://huggingface.co/ryatora/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-17-bsc_bio_ehr_spanish_symptemist_word2vec_8_ner_en.md b/docs/_posts/ahmedlone127/2024-09-17-bsc_bio_ehr_spanish_symptemist_word2vec_8_ner_en.md new file mode 100644 index 00000000000000..bb85681a762b73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-17-bsc_bio_ehr_spanish_symptemist_word2vec_8_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bsc_bio_ehr_spanish_symptemist_word2vec_8_ner RoBertaForTokenClassification from Rodrigo1771 +author: John Snow Labs +name: bsc_bio_ehr_spanish_symptemist_word2vec_8_ner +date: 2024-09-17 +tags: [en, open_source, onnx, token_classification, roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bsc_bio_ehr_spanish_symptemist_word2vec_8_ner` is an English model originally trained by Rodrigo1771. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bsc_bio_ehr_spanish_symptemist_word2vec_8_ner_en_5.5.0_3.0_1726538122376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bsc_bio_ehr_spanish_symptemist_word2vec_8_ner_en_5.5.0_3.0_1726538122376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = RoBertaForTokenClassification.pretrained("bsc_bio_ehr_spanish_symptemist_word2vec_8_ner","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = RoBertaForTokenClassification.pretrained("bsc_bio_ehr_spanish_symptemist_word2vec_8_ner", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bsc_bio_ehr_spanish_symptemist_word2vec_8_ner| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|435.1 MB| + +## References + +https://huggingface.co/Rodrigo1771/bsc-bio-ehr-es-symptemist-word2vec-8-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-17-burmese_awesome_qa_model_adalee1001_en.md b/docs/_posts/ahmedlone127/2024-09-17-burmese_awesome_qa_model_adalee1001_en.md new file mode 100644 index 00000000000000..760911d6df89a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-17-burmese_awesome_qa_model_adalee1001_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English burmese_awesome_qa_model_adalee1001 DistilBertForQuestionAnswering from Adalee1001 +author: John Snow Labs +name: burmese_awesome_qa_model_adalee1001 +date: 2024-09-17 +tags: [en, open_source, onnx, question_answering, distilbert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_qa_model_adalee1001` is an English model originally trained by Adalee1001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_adalee1001_en_5.5.0_3.0_1726599603991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_adalee1001_en_5.5.0_3.0_1726599603991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_adalee1001","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_adalee1001", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_adalee1001| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Adalee1001/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-17-burmese_awesome_text_classification_yu_en_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-17-burmese_awesome_text_classification_yu_en_pipeline_en.md new file mode 100644 index 00000000000000..942ff8d2a3d88f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-17-burmese_awesome_text_classification_yu_en_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_text_classification_yu_en_pipeline pipeline DistilBertForSequenceClassification from Yu-En +author: John Snow Labs +name: burmese_awesome_text_classification_yu_en_pipeline +date: 2024-09-17 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_text_classification_yu_en_pipeline` is an English model originally trained by Yu-En. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_text_classification_yu_en_pipeline_en_5.5.0_3.0_1726594286389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_text_classification_yu_en_pipeline_en_5.5.0_3.0_1726594286389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_text_classification_yu_en_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_text_classification_yu_en_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_text_classification_yu_en_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Yu-En/my-awesome-text-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-17-tmp_trainer_dstaples08_en.md b/docs/_posts/ahmedlone127/2024-09-17-tmp_trainer_dstaples08_en.md new file mode 100644 index 00000000000000..d13801461699fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-17-tmp_trainer_dstaples08_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tmp_trainer_dstaples08 DistilBertForSequenceClassification from dstaples08 +author: John Snow Labs +name: tmp_trainer_dstaples08 +date: 2024-09-17 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tmp_trainer_dstaples08` is an English model originally trained by dstaples08. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tmp_trainer_dstaples08_en_5.5.0_3.0_1726593658439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tmp_trainer_dstaples08_en_5.5.0_3.0_1726593658439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tmp_trainer_dstaples08","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tmp_trainer_dstaples08", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tmp_trainer_dstaples08| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/dstaples08/tmp_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-17-whisper_small_turkish_muhtasham_tr.md b/docs/_posts/ahmedlone127/2024-09-17-whisper_small_turkish_muhtasham_tr.md new file mode 100644 index 00000000000000..70b00bcb974bf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-17-whisper_small_turkish_muhtasham_tr.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Turkish whisper_small_turkish_muhtasham WhisperForCTC from muhtasham +author: John Snow Labs +name: whisper_small_turkish_muhtasham +date: 2024-09-17 +tags: [tr, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`whisper_small_turkish_muhtasham` is a Turkish model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_turkish_muhtasham_tr_5.5.0_3.0_1726569200533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_turkish_muhtasham_tr_5.5.0_3.0_1726569200533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("whisper_small_turkish_muhtasham","tr") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("whisper_small_turkish_muhtasham", "tr") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_turkish_muhtasham| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|tr| +|Size:|1.7 GB| + +## References + +https://huggingface.co/muhtasham/whisper-small-tr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-17-whisper_tiny_minds14_fitrahgiffari63_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-17-whisper_tiny_minds14_fitrahgiffari63_pipeline_en.md new file mode 100644 index 00000000000000..27d7946d8c3e4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-17-whisper_tiny_minds14_fitrahgiffari63_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English whisper_tiny_minds14_fitrahgiffari63_pipeline pipeline WhisperForCTC from fitrahgiffari63 +author: John Snow Labs +name: whisper_tiny_minds14_fitrahgiffari63_pipeline +date: 2024-09-17 +tags: [en, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`whisper_tiny_minds14_fitrahgiffari63_pipeline` is an English model originally trained by fitrahgiffari63. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_tiny_minds14_fitrahgiffari63_pipeline_en_5.5.0_3.0_1726550325528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_tiny_minds14_fitrahgiffari63_pipeline_en_5.5.0_3.0_1726550325528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_tiny_minds14_fitrahgiffari63_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_tiny_minds14_fitrahgiffari63_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_tiny_minds14_fitrahgiffari63_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|390.9 MB| + +## References + +https://huggingface.co/fitrahgiffari63/whisper-tiny-minds14 + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-distilbert_base_uncased_finetuned_m_help_seller_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-18-distilbert_base_uncased_finetuned_m_help_seller_pipeline_en.md new file mode 100644 index 00000000000000..85efbd3244041a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-distilbert_base_uncased_finetuned_m_help_seller_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_m_help_seller_pipeline pipeline DistilBertForSequenceClassification from Gregorig +author: John Snow Labs +name: distilbert_base_uncased_finetuned_m_help_seller_pipeline +date: 2024-09-18 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_m_help_seller_pipeline` is an English model originally trained by Gregorig. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_m_help_seller_pipeline_en_5.5.0_3.0_1726680729383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_m_help_seller_pipeline_en_5.5.0_3.0_1726680729383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_m_help_seller_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_m_help_seller_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_m_help_seller_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Gregorig/distilbert-base-uncased-finetuned-m_help_seller + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-finetuning_distilbert_sentiment_model_en.md b/docs/_posts/ahmedlone127/2024-09-18-finetuning_distilbert_sentiment_model_en.md new file mode 100644 index 00000000000000..8ae2601f37c21a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-finetuning_distilbert_sentiment_model_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English finetuning_distilbert_sentiment_model DistilBertForSequenceClassification from Annamaziarz1 +author: John Snow Labs +name: finetuning_distilbert_sentiment_model +date: 2024-09-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_distilbert_sentiment_model` is an English model originally trained by Annamaziarz1. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_sentiment_model_en_5.5.0_3.0_1726681688720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_sentiment_model_en_5.5.0_3.0_1726681688720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_distilbert_sentiment_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +References + +https://huggingface.co/Annamaziarz1/finetuning-distilbert-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-kltn_csi_xlm_en.md b/docs/_posts/ahmedlone127/2024-09-18-kltn_csi_xlm_en.md new file mode 100644 index 00000000000000..8f56fe1f5dab25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-kltn_csi_xlm_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English kltn_csi_xlm XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: kltn_csi_xlm +date: 2024-09-18 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kltn_csi_xlm` is an English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kltn_csi_xlm_en_5.5.0_3.0_1726697854561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kltn_csi_xlm_en_5.5.0_3.0_1726697854561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("kltn_csi_xlm","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("kltn_csi_xlm", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kltn_csi_xlm| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|773.9 MB| + +## References + +https://huggingface.co/ThuyNT03/KLTN_CSI_xlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-sent_bert_base_uncased_mlp_scirepeval_chemistry_large_en.md b/docs/_posts/ahmedlone127/2024-09-18-sent_bert_base_uncased_mlp_scirepeval_chemistry_large_en.md new file mode 100644 index 00000000000000..45efe14f983346 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-sent_bert_base_uncased_mlp_scirepeval_chemistry_large_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_base_uncased_mlp_scirepeval_chemistry_large BertSentenceEmbeddings from jonas-luehrs +author: John Snow Labs +name: sent_bert_base_uncased_mlp_scirepeval_chemistry_large +date: 2024-09-18 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_base_uncased_mlp_scirepeval_chemistry_large` is an English model originally trained by jonas-luehrs. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_mlp_scirepeval_chemistry_large_en_5.5.0_3.0_1726687375678.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_mlp_scirepeval_chemistry_large_en_5.5.0_3.0_1726687375678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_uncased_mlp_scirepeval_chemistry_large","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_uncased_mlp_scirepeval_chemistry_large","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_uncased_mlp_scirepeval_chemistry_large| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-MLP-scirepeval-chemistry-LARGE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-sent_esmlmt60_10000_en.md b/docs/_posts/ahmedlone127/2024-09-18-sent_esmlmt60_10000_en.md new file mode 100644 index 00000000000000..d118e842b152af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-sent_esmlmt60_10000_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_esmlmt60_10000 BertSentenceEmbeddings from hjkim811 +author: John Snow Labs +name: sent_esmlmt60_10000 +date: 2024-09-18 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_esmlmt60_10000` is an English model originally trained by hjkim811. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_esmlmt60_10000_en_5.5.0_3.0_1726675877631.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_esmlmt60_10000_en_5.5.0_3.0_1726675877631.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_esmlmt60_10000","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_esmlmt60_10000","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_esmlmt60_10000| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/hjkim811/esmlmt60-10000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-text_mining_project_distil_bert_en.md b/docs/_posts/ahmedlone127/2024-09-18-text_mining_project_distil_bert_en.md new file mode 100644 index 00000000000000..c044ff2a8610ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-text_mining_project_distil_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English text_mining_project_distil_bert DistilBertForSequenceClassification from algmarques +author: John Snow Labs +name: text_mining_project_distil_bert +date: 2024-09-18 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_mining_project_distil_bert` is an English model originally trained by algmarques. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_mining_project_distil_bert_en_5.5.0_3.0_1726631042499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_mining_project_distil_bert_en_5.5.0_3.0_1726631042499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_mining_project_distil_bert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_mining_project_distil_bert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_mining_project_distil_bert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/algmarques/text_mining_project_distil_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-transient_data_en.md b/docs/_posts/ahmedlone127/2024-09-18-transient_data_en.md new file mode 100644 index 00000000000000..bab36a286faa0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-transient_data_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English transient_data DistilBertForSequenceClassification from Jingni +author: John Snow Labs +name: transient_data +date: 2024-09-18 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `transient_data` is an English model originally trained by Jingni. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/transient_data_en_5.5.0_3.0_1726670083176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/transient_data_en_5.5.0_3.0_1726670083176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("transient_data","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("transient_data", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|transient_data| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Jingni/transient_data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-18-xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline_en.md new file mode 100644 index 00000000000000..22450deb54e7bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline pipeline XlmRoBertaForTokenClassification from km0228kr +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline +date: 2024-09-18 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline` is an English model originally trained by km0228kr. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline_en_5.5.0_3.0_1726635325693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline_en_5.5.0_3.0_1726635325693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_km0228kr_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|853.8 MB| + +## References + +https://huggingface.co/km0228kr/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-18-xlm_roberta_base_finetuned_panx_italian_taoyoung_en.md b/docs/_posts/ahmedlone127/2024-09-18-xlm_roberta_base_finetuned_panx_italian_taoyoung_en.md new file mode 100644 index 00000000000000..a197d8337ad1d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-18-xlm_roberta_base_finetuned_panx_italian_taoyoung_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_italian_taoyoung XlmRoBertaForTokenClassification from taoyoung +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_italian_taoyoung +date: 2024-09-18 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_italian_taoyoung` is an English model originally trained by taoyoung. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_taoyoung_en_5.5.0_3.0_1726656653715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_taoyoung_en_5.5.0_3.0_1726656653715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_italian_taoyoung","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_italian_taoyoung", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_italian_taoyoung| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|824.2 MB| + +## References + +https://huggingface.co/taoyoung/xlm-roberta-base-finetuned-panx-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-19-distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829_en.md b/docs/_posts/ahmedlone127/2024-09-19-distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829_en.md new file mode 100644 index 00000000000000..e0566cae862281 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-19-distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829 DistilBertEmbeddings from suzuki0829 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829 +date: 2024-09-19 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829` is an English model originally trained by suzuki0829. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829_en_5.5.0_3.0_1726727640228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829_en_5.5.0_3.0_1726727640228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") +val embeddings = DistilBertEmbeddings + .pretrained("distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_d5716d28_suzuki0829| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +References + +https://huggingface.co/suzuki0829/distilbert-base-uncased-finetuned-squad-d5716d28 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-19-japanese_fine_tuned_whisper_model_nikolajvestergaard_ja.md b/docs/_posts/ahmedlone127/2024-09-19-japanese_fine_tuned_whisper_model_nikolajvestergaard_ja.md new file mode 100644 index 00000000000000..d54e2ea8f35f75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-19-japanese_fine_tuned_whisper_model_nikolajvestergaard_ja.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Japanese japanese_fine_tuned_whisper_model_nikolajvestergaard WhisperForCTC from Nikolajvestergaard +author: John Snow Labs +name: japanese_fine_tuned_whisper_model_nikolajvestergaard +date: 2024-09-19 +tags: [ja, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: ja +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `japanese_fine_tuned_whisper_model_nikolajvestergaard` is a Japanese model originally trained by Nikolajvestergaard. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/japanese_fine_tuned_whisper_model_nikolajvestergaard_ja_5.5.0_3.0_1726759322006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/japanese_fine_tuned_whisper_model_nikolajvestergaard_ja_5.5.0_3.0_1726759322006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("japanese_fine_tuned_whisper_model_nikolajvestergaard","ja") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("japanese_fine_tuned_whisper_model_nikolajvestergaard", "ja") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|japanese_fine_tuned_whisper_model_nikolajvestergaard| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|ja| +|Size:|390.9 MB| + +## References + +https://huggingface.co/Nikolajvestergaard/Japanese_Fine_Tuned_Whisper_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-19-roberta_large_e2_noweight_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-19-roberta_large_e2_noweight_pipeline_en.md new file mode 100644 index 00000000000000..998bff67b1489f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-19-roberta_large_e2_noweight_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_large_e2_noweight_pipeline pipeline RoBertaForSequenceClassification from JerryYanJiang +author: John Snow Labs +name: roberta_large_e2_noweight_pipeline +date: 2024-09-19 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_e2_noweight_pipeline` is an English model originally trained by JerryYanJiang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_e2_noweight_pipeline_en_5.5.0_3.0_1726733830029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_e2_noweight_pipeline_en_5.5.0_3.0_1726733830029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_large_e2_noweight_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_large_e2_noweight_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_e2_noweight_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/JerryYanJiang/roberta-large-e2-noweight + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-19-roberta_untrained_2eps_seed408_en.md b/docs/_posts/ahmedlone127/2024-09-19-roberta_untrained_2eps_seed408_en.md new file mode 100644 index 00000000000000..88df1b9bb161e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-19-roberta_untrained_2eps_seed408_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_untrained_2eps_seed408 RoBertaForSequenceClassification from custeau +author: John Snow Labs +name: roberta_untrained_2eps_seed408 +date: 2024-09-19 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_untrained_2eps_seed408` is an English model originally trained by custeau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_untrained_2eps_seed408_en_5.5.0_3.0_1726725772348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_untrained_2eps_seed408_en_5.5.0_3.0_1726725772348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_untrained_2eps_seed408","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_untrained_2eps_seed408", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_untrained_2eps_seed408| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.9 MB| + +## References + +https://huggingface.co/custeau/roberta_untrained_2eps_seed408 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-19-whisper_small_russian_v3_ru.md b/docs/_posts/ahmedlone127/2024-09-19-whisper_small_russian_v3_ru.md new file mode 100644 index 00000000000000..a3b6a5efaf66a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-19-whisper_small_russian_v3_ru.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Russian whisper_small_russian_v3 WhisperForCTC from sam-alavardo-1980 +author: John Snow Labs +name: whisper_small_russian_v3 +date: 2024-09-19 +tags: [ru, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: ru +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_small_russian_v3` is a Russian model originally trained by sam-alavardo-1980. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_russian_v3_ru_5.5.0_3.0_1726755907010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_russian_v3_ru_5.5.0_3.0_1726755907010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("whisper_small_russian_v3","ru") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("whisper_small_russian_v3", "ru") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_russian_v3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|ru| +|Size:|1.7 GB| + +## References + +https://huggingface.co/sam-alavardo-1980/whisper-small-ru-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-absa_restaurant_froberta_base_v2_en.md b/docs/_posts/ahmedlone127/2024-09-20-absa_restaurant_froberta_base_v2_en.md new file mode 100644 index 00000000000000..6aeb08c9bd9c21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-absa_restaurant_froberta_base_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English absa_restaurant_froberta_base_v2 RoBertaEmbeddings from AliAhmad001 +author: John Snow Labs +name: absa_restaurant_froberta_base_v2 +date: 2024-09-20 +tags: [en, open_source, onnx, embeddings, roberta] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `absa_restaurant_froberta_base_v2` is an English model originally trained by AliAhmad001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/absa_restaurant_froberta_base_v2_en_5.5.0_3.0_1726857784982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/absa_restaurant_froberta_base_v2_en_5.5.0_3.0_1726857784982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = RoBertaEmbeddings.pretrained("absa_restaurant_froberta_base_v2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = RoBertaEmbeddings.pretrained("absa_restaurant_froberta_base_v2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|absa_restaurant_froberta_base_v2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[roberta]| +|Language:|en| +|Size:|466.0 MB| + +## References + +https://huggingface.co/AliAhmad001/absa-restaurant-froberta-base-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-augmented_model_fast_2_c_norwegian_copula_norwegian_time_en.md b/docs/_posts/ahmedlone127/2024-09-20-augmented_model_fast_2_c_norwegian_copula_norwegian_time_en.md new file mode 100644 index 00000000000000..177b81211b02da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-augmented_model_fast_2_c_norwegian_copula_norwegian_time_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English augmented_model_fast_2_c_norwegian_copula_norwegian_time DistilBertForSequenceClassification from LeonardoFettucciari +author: John Snow Labs +name: augmented_model_fast_2_c_norwegian_copula_norwegian_time +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `augmented_model_fast_2_c_norwegian_copula_norwegian_time` is an English model originally trained by LeonardoFettucciari.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/augmented_model_fast_2_c_norwegian_copula_norwegian_time_en_5.5.0_3.0_1726871744638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/augmented_model_fast_2_c_norwegian_copula_norwegian_time_en_5.5.0_3.0_1726871744638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("augmented_model_fast_2_c_norwegian_copula_norwegian_time","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("augmented_model_fast_2_c_norwegian_copula_norwegian_time", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|augmented_model_fast_2_c_norwegian_copula_norwegian_time| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/LeonardoFettucciari/augmented_model_fast_2_c_NO_COPULA_NO_TIME \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0_en.md b/docs/_posts/ahmedlone127/2024-09-20-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0_en.md new file mode 100644 index 00000000000000..b46a946aecc654 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0 +date: 2024-09-20 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0` is an English model originally trained by danielkty22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0_en_5.5.0_3.0_1726833873918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0_en_5.5.0_3.0_1726833873918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0004_swati_0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-1e-05-wd-0.001-dp-0.0004-ss-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-bert_base_uncased_vitamin_c_fact_verification_en.md b/docs/_posts/ahmedlone127/2024-09-20-bert_base_uncased_vitamin_c_fact_verification_en.md new file mode 100644 index 00000000000000..d4cf0b80322535 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-bert_base_uncased_vitamin_c_fact_verification_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_vitamin_c_fact_verification BertForQuestionAnswering from DunnBC22 +author: John Snow Labs +name: bert_base_uncased_vitamin_c_fact_verification +date: 2024-09-20 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_vitamin_c_fact_verification` is an English model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_vitamin_c_fact_verification_en_5.5.0_3.0_1726820758943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_vitamin_c_fact_verification_en_5.5.0_3.0_1726820758943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_vitamin_c_fact_verification","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_vitamin_c_fact_verification", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_vitamin_c_fact_verification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/DunnBC22/bert-base-uncased-Vitamin_C_Fact_Verification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline_en.md new file mode 100644 index 00000000000000..e0b7da26ac6d17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline pipeline RoBertaForTokenClassification from Rodrigo1771 +author: John Snow Labs +name: bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline` is an English model originally trained by Rodrigo1771.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline_en_5.5.0_3.0_1726847477378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline_en_5.5.0_3.0_1726847477378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bsc_bio_ehr_spanish_combined_train_distemist_dev_word2vec_85_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|440.6 MB| + +## References + +https://huggingface.co/Rodrigo1771/bsc-bio-ehr-es-combined-train-distemist-dev-word2vec-85-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-burmese_awesome_model_robinsh2023_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-burmese_awesome_model_robinsh2023_pipeline_en.md new file mode 100644 index 00000000000000..c53fda86fcc133 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-burmese_awesome_model_robinsh2023_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_model_robinsh2023_pipeline pipeline DistilBertForSequenceClassification from Robinsh2023 +author: John Snow Labs +name: burmese_awesome_model_robinsh2023_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_robinsh2023_pipeline` is an English model originally trained by Robinsh2023.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_robinsh2023_pipeline_en_5.5.0_3.0_1726809000095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_robinsh2023_pipeline_en_5.5.0_3.0_1726809000095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_model_robinsh2023_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_model_robinsh2023_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_robinsh2023_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Robinsh2023/my_awesome_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_cate_classfication_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_cate_classfication_pipeline_en.md new file mode 100644 index 00000000000000..21b660e91e8299 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_cate_classfication_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cate_classfication_pipeline pipeline DistilBertForSequenceClassification from shnguo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cate_classfication_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cate_classfication_pipeline` is an English model originally trained by shnguo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cate_classfication_pipeline_en_5.5.0_3.0_1726823577140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cate_classfication_pipeline_en_5.5.0_3.0_1726823577140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_cate_classfication_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_cate_classfication_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cate_classfication_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|253.6 MB| + +## References + +https://huggingface.co/shnguo/distilbert-base-uncased-finetuned-cate-classfication + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline_en.md new file mode 100644 index 00000000000000..612c9634d84342 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline pipeline DistilBertForSequenceClassification from minseok0109 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline` is an English model originally trained by minseok0109.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline_en_5.5.0_3.0_1726823597918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline_en_5.5.0_3.0_1726823597918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_minseok0109_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/minseok0109/distilbert-base-uncased-finetuned-emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_emotion_shng2025_en.md b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_emotion_shng2025_en.md new file mode 100644 index 00000000000000..35307329560f93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_emotion_shng2025_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_shng2025 DistilBertForSequenceClassification from shng2025 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_shng2025 +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_shng2025` is an English model originally trained by shng2025.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_shng2025_en_5.5.0_3.0_1726871504276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_shng2025_en_5.5.0_3.0_1726871504276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_shng2025","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_shng2025", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
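Each annotator reads the columns named in `setInputCols`, so those names have to match a column that an earlier stage wrote with `setOutputCol` (or one already in the input DataFrame). As a rough illustration only, the check can be modelled in plain Python without Spark. The `check_wiring` helper and the stage tuples below are hypothetical and are not part of Spark NLP; the column names `text`, `document`, `token`, and `class` are the ones used on this page.

```python
# Hypothetical helper (not a Spark NLP API): a plain-Python model of column wiring.
def check_wiring(stages, source_columns):
    """Return (stage_name, missing_column) pairs for inputs nobody produced.

    stages: list of (name, input_columns, output_column) tuples, in pipeline order.
    source_columns: columns already present in the input DataFrame.
    """
    available = set(source_columns)
    problems = []
    for name, inputs, output in stages:
        for col in inputs:
            if col not in available:
                problems.append((name, col))
        # The stage's output becomes available to every later stage.
        available.add(output)
    return problems


# Assumed wiring: text -> document -> token -> class.
stages = [
    ("documentAssembler", ["text"], "document"),
    ("tokenizer", ["document"], "token"),
    ("sequenceClassifier", ["document", "token"], "class"),
]
print(check_wiring(stages, ["text"]))  # []

# A misspelled input column is reported against the stage that asks for it.
typo = stages[:2] + [("sequenceClassifier", ["documents", "token"], "class")]
print(check_wiring(typo, ["text"]))  # [('sequenceClassifier', 'documents')]
```

In a real pipeline the same mismatch would likely surface as an error when the pipeline is fitted or applied; the sketch only makes the naming rule explicit.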
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_shng2025| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/shng2025/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_imdb_h40vv3n_en.md b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_imdb_h40vv3n_en.md new file mode 100644 index 00000000000000..54916adb28c0c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_imdb_h40vv3n_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_h40vv3n DistilBertEmbeddings from h40vv3n +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_h40vv3n +date: 2024-09-20 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_imdb_h40vv3n` is an English model originally trained by h40vv3n.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_h40vv3n_en_5.5.0_3.0_1726818119010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_h40vv3n_en_5.5.0_3.0_1726818119010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_h40vv3n","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_h40vv3n","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_h40vv3n| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/h40vv3n/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_pad_mult_clf_v2_en.md b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_pad_mult_clf_v2_en.md new file mode 100644 index 00000000000000..e872f153a0e9a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_finetuned_pad_mult_clf_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pad_mult_clf_v2 DistilBertForSequenceClassification from netoferraz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pad_mult_clf_v2 +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pad_mult_clf_v2` is an English model originally trained by netoferraz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pad_mult_clf_v2_en_5.5.0_3.0_1726809524710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pad_mult_clf_v2_en_5.5.0_3.0_1726809524710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pad_mult_clf_v2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pad_mult_clf_v2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pad_mult_clf_v2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/netoferraz/distilbert-base-uncased-finetuned-pad-mult-clf-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300_en.md b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300_en.md new file mode 100644 index 00000000000000..958f5207736d0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300 DistilBertForSequenceClassification from tom192180 +author: John Snow Labs +name: distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300 +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300` is an English model originally trained by tom192180.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300_en_5.5.0_3.0_1726829957204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300_en_5.5.0_3.0_1726829957204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_odm_zphr_0st8sd_ut72ut1large8pfxnf_simsp400_clean300| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/tom192180/distilbert-base-uncased_odm_zphr_0st8sd_ut72ut1large8PfxNf_simsp400_clean300 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3000_samples_david1987bb_en.md b/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3000_samples_david1987bb_en.md new file mode 100644 index 00000000000000..a358a9edc31622 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3000_samples_david1987bb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_david1987bb DistilBertForSequenceClassification from David1987BB +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_david1987bb +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_david1987bb` is an English model originally trained by David1987BB.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_david1987bb_en_5.5.0_3.0_1726848694391.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_david1987bb_en_5.5.0_3.0_1726848694391.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_david1987bb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_david1987bb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_david1987bb| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/David1987BB/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3000_samples_yeabinml_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3000_samples_yeabinml_pipeline_en.md new file mode 100644 index 00000000000000..3f0b648e64b2b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3000_samples_yeabinml_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_yeabinml_pipeline pipeline DistilBertForSequenceClassification from Yeabinml +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_yeabinml_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_yeabinml_pipeline` is an English model originally trained by Yeabinml.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_yeabinml_pipeline_en_5.5.0_3.0_1726871361180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_yeabinml_pipeline_en_5.5.0_3.0_1726871361180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_sentiment_model_3000_samples_yeabinml_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_sentiment_model_3000_samples_yeabinml_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_yeabinml_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Yeabinml/finetuning-sentiment-model-3000-samples + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3500_samples_train_yvillamil_en.md b/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3500_samples_train_yvillamil_en.md new file mode 100644 index 00000000000000..000170446b0fd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-finetuning_sentiment_model_3500_samples_train_yvillamil_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_sentiment_model_3500_samples_train_yvillamil DistilBertForSequenceClassification from yvillamil +author: John Snow Labs +name: finetuning_sentiment_model_3500_samples_train_yvillamil +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3500_samples_train_yvillamil` is an English model originally trained by yvillamil.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3500_samples_train_yvillamil_en_5.5.0_3.0_1726823471007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3500_samples_train_yvillamil_en_5.5.0_3.0_1726823471007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3500_samples_train_yvillamil","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3500_samples_train_yvillamil", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3500_samples_train_yvillamil| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/yvillamil/finetuning-sentiment-model-3500-samples-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline_en.md new file mode 100644 index 00000000000000..763af79b918a31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline pipeline RoBertaForSequenceClassification from RogerB +author: John Snow Labs +name: kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline` is an English model originally trained by RogerB.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline_en_5.5.0_3.0_1726804614586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline_en_5.5.0_3.0_1726804614586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/RogerB/kinyaRoberta-large-kinte-finetuned-kin-tweet-finetuned-kin-sent2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-nlp2_base_3e_4_nathanjlee_en.md b/docs/_posts/ahmedlone127/2024-09-20-nlp2_base_3e_4_nathanjlee_en.md new file mode 100644 index 00000000000000..72f9fcb651ed00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-nlp2_base_3e_4_nathanjlee_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English nlp2_base_3e_4_nathanjlee DistilBertForSequenceClassification from NathanJLee +author: John Snow Labs +name: nlp2_base_3e_4_nathanjlee +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp2_base_3e_4_nathanjlee` is an English model originally trained by NathanJLee.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp2_base_3e_4_nathanjlee_en_5.5.0_3.0_1726849201280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp2_base_3e_4_nathanjlee_en_5.5.0_3.0_1726849201280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp2_base_3e_4_nathanjlee","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp2_base_3e_4_nathanjlee", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp2_base_3e_4_nathanjlee| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/NathanJLee/NLP2_Base_3e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-roberta_baseline_finetuned_atis_3pct_v2_en.md b/docs/_posts/ahmedlone127/2024-09-20-roberta_baseline_finetuned_atis_3pct_v2_en.md new file mode 100644 index 00000000000000..33e41c121f2ec6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-roberta_baseline_finetuned_atis_3pct_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_baseline_finetuned_atis_3pct_v2 RoBertaForSequenceClassification from benayas +author: John Snow Labs +name: roberta_baseline_finetuned_atis_3pct_v2 +date: 2024-09-20 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_baseline_finetuned_atis_3pct_v2` is an English model originally trained by benayas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_baseline_finetuned_atis_3pct_v2_en_5.5.0_3.0_1726851816619.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_baseline_finetuned_atis_3pct_v2_en_5.5.0_3.0_1726851816619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_baseline_finetuned_atis_3pct_v2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_baseline_finetuned_atis_3pct_v2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_baseline_finetuned_atis_3pct_v2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|416.8 MB| + +## References + +https://huggingface.co/benayas/roberta-baseline-finetuned-atis_3pct_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-roberta_large_ncbi_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-roberta_large_ncbi_pipeline_en.md new file mode 100644 index 00000000000000..b9c3edd8daf92a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-roberta_large_ncbi_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_large_ncbi_pipeline pipeline RoBertaForTokenClassification from CheccoCando +author: John Snow Labs +name: roberta_large_ncbi_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_ncbi_pipeline` is an English model originally trained by CheccoCando. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_ncbi_pipeline_en_5.5.0_3.0_1726862389611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_ncbi_pipeline_en_5.5.0_3.0_1726862389611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_large_ncbi_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_large_ncbi_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_ncbi_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/CheccoCando/roberta-large_ncbi + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-roberta_ner_devanshrj_en.md b/docs/_posts/ahmedlone127/2024-09-20-roberta_ner_devanshrj_en.md new file mode 100644 index 00000000000000..f7f742d84f9434 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-roberta_ner_devanshrj_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_ner_devanshrj RoBertaForTokenClassification from devanshrj +author: John Snow Labs +name: roberta_ner_devanshrj +date: 2024-09-20 +tags: [en, open_source, onnx, token_classification, roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_ner_devanshrj` is an English model originally trained by devanshrj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_ner_devanshrj_en_5.5.0_3.0_1726847615829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_ner_devanshrj_en_5.5.0_3.0_1726847615829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = RoBertaForTokenClassification.pretrained("roberta_ner_devanshrj","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = RoBertaForTokenClassification.pretrained("roberta_ner_devanshrj", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_ner_devanshrj| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|428.2 MB| + +## References + +https://huggingface.co/devanshrj/roberta-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-sent_bert_base_uncased_git_zh.md b/docs/_posts/ahmedlone127/2024-09-20-sent_bert_base_uncased_git_zh.md new file mode 100644 index 00000000000000..3f12c59b2622d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-sent_bert_base_uncased_git_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese sent_bert_base_uncased_git BertSentenceEmbeddings from littlebird13 +author: John Snow Labs +name: sent_bert_base_uncased_git +date: 2024-09-20 +tags: [zh, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_base_uncased_git` is a Chinese model originally trained by littlebird13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_git_zh_5.5.0_3.0_1726867226378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_git_zh_5.5.0_3.0_1726867226378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_uncased_git","zh") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_uncased_git","zh") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_uncased_git| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|zh| +|Size:|407.2 MB| + +## References + +https://huggingface.co/littlebird13/bert-base-uncased-git \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-sentience_classification_score_pytorch_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-sentience_classification_score_pytorch_pipeline_en.md new file mode 100644 index 00000000000000..1d5d72509480a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-sentience_classification_score_pytorch_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sentience_classification_score_pytorch_pipeline pipeline DistilBertForSequenceClassification from aeaee +author: John Snow Labs +name: sentience_classification_score_pytorch_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentience_classification_score_pytorch_pipeline` is an English model originally trained by aeaee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentience_classification_score_pytorch_pipeline_en_5.5.0_3.0_1726823626901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentience_classification_score_pytorch_pipeline_en_5.5.0_3.0_1726823626901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sentience_classification_score_pytorch_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sentience_classification_score_pytorch_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentience_classification_score_pytorch_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aeaee/SENTIENCE_Classification_Score_pytorch + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-sentiment_analysis_base_rslora_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-sentiment_analysis_base_rslora_pipeline_en.md new file mode 100644 index 00000000000000..1790ae2ebb5a99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-sentiment_analysis_base_rslora_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sentiment_analysis_base_rslora_pipeline pipeline RoBertaForSequenceClassification from Shotaro30678 +author: John Snow Labs +name: sentiment_analysis_base_rslora_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_base_rslora_pipeline` is an English model originally trained by Shotaro30678. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_base_rslora_pipeline_en_5.5.0_3.0_1726851649597.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_base_rslora_pipeline_en_5.5.0_3.0_1726851649597.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sentiment_analysis_base_rslora_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sentiment_analysis_base_rslora_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_base_rslora_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|308.9 MB| + +## References + +https://huggingface.co/Shotaro30678/sentiment-analysis-base-rslora + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-20-trainer3b_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-20-trainer3b_pipeline_en.md new file mode 100644 index 00000000000000..a6c82d0f4c9c26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-20-trainer3b_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English trainer3b_pipeline pipeline DistilBertForSequenceClassification from SimoneJLaudani +author: John Snow Labs +name: trainer3b_pipeline +date: 2024-09-20 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trainer3b_pipeline` is an English model originally trained by SimoneJLaudani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trainer3b_pipeline_en_5.5.0_3.0_1726823850489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trainer3b_pipeline_en_5.5.0_3.0_1726823850489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trainer3b_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trainer3b_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trainer3b_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.1 MB| + +## References + +https://huggingface.co/SimoneJLaudani/trainer3b + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-21-bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline_en.md new file mode 100644 index 00000000000000..e26e910a5c4639 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline +date: 2024-09-21 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline` is an English model originally trained by danielkty22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline_en_5.5.0_3.0_1726946344279.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline_en_5.5.0_3.0_1726946344279.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_0_0005_wd_0_01_dp_0_99_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-0.0005-wd-0.01-dp-0.99 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-coha1860s_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-21-coha1860s_pipeline_en.md new file mode 100644 index 00000000000000..d8fb2814a00bc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-coha1860s_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English coha1860s_pipeline pipeline RoBertaEmbeddings from simonmun +author: John Snow Labs +name: coha1860s_pipeline +date: 2024-09-21 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `coha1860s_pipeline` is an English model originally trained by simonmun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/coha1860s_pipeline_en_5.5.0_3.0_1726934236904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/coha1860s_pipeline_en_5.5.0_3.0_1726934236904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("coha1860s_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("coha1860s_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|coha1860s_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|312.0 MB| + +## References + +https://huggingface.co/simonmun/COHA1860s + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-distilroberta_topic_classification_3_en.md b/docs/_posts/ahmedlone127/2024-09-21-distilroberta_topic_classification_3_en.md new file mode 100644 index 00000000000000..c6333164fae4f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-distilroberta_topic_classification_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilroberta_topic_classification_3 RoBertaForSequenceClassification from abdulmatinomotoso +author: John Snow Labs +name: distilroberta_topic_classification_3 +date: 2024-09-21 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_topic_classification_3` is an English model originally trained by abdulmatinomotoso. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_topic_classification_3_en_5.5.0_3.0_1726940691935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_topic_classification_3_en_5.5.0_3.0_1726940691935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_topic_classification_3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_topic_classification_3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_topic_classification_3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.4 MB| + +## References + +https://huggingface.co/abdulmatinomotoso/distilroberta-topic-classification_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-openai_whisper_small_zoomrx_colab_1_xx.md b/docs/_posts/ahmedlone127/2024-09-21-openai_whisper_small_zoomrx_colab_1_xx.md new file mode 100644 index 00000000000000..cf97a3d436550d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-openai_whisper_small_zoomrx_colab_1_xx.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Multilingual openai_whisper_small_zoomrx_colab_1 WhisperForCTC from PraveenJesu +author: John Snow Labs +name: openai_whisper_small_zoomrx_colab_1 +date: 2024-09-21 +tags: [xx, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`openai_whisper_small_zoomrx_colab_1` is a Multilingual model originally trained by PraveenJesu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/openai_whisper_small_zoomrx_colab_1_xx_5.5.0_3.0_1726939013591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/openai_whisper_small_zoomrx_colab_1_xx_5.5.0_3.0_1726939013591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("openai_whisper_small_zoomrx_colab_1","xx") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("openai_whisper_small_zoomrx_colab_1", "xx") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|openai_whisper_small_zoomrx_colab_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|xx| +|Size:|1.1 GB| + +## References + +https://huggingface.co/PraveenJesu/openai-whisper-small-zoomrx-colab-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais_en.md b/docs/_posts/ahmedlone127/2024-09-21-roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais_en.md new file mode 100644 index 00000000000000..2abb5aca74f3df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais RoBertaForSequenceClassification from vg055 +author: John Snow Labs +name: roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais +date: 2024-09-21 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais` is an English model originally trained by vg055. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais_en_5.5.0_3.0_1726940346829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais_en_5.5.0_3.0_1726940346829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_pais| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.2 MB| + +## References + +https://huggingface.co/vg055/roberta-base-bne-finetuned-TripAdvisorDomainAdaptation-finetuned-e2-RestMex2023-pais \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-roberta_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-21-roberta_pipeline_en.md new file mode 100644 index 00000000000000..71b6c83331c3c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-roberta_pipeline_en.md @@ -0,0 +1,72 @@ +--- +layout: model +title: English roberta_pipeline pipeline RoBertaForTokenClassification from autosyrup +author: John Snow Labs +name: roberta_pipeline +date: 2024-09-21 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_pipeline` is an English model originally trained by autosyrup. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_pipeline_en_5.5.0_3.0_1726952872316.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_pipeline_en_5.5.0_3.0_1726952872316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("roberta_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("roberta_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +References + +https://huggingface.co/autosyrup/roberta + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10_he.md b/docs/_posts/ahmedlone127/2024-09-21-teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10_he.md new file mode 100644 index 00000000000000..814347315d34e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10_he.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Hebrew teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10 WhisperForCTC from cantillation +author: John Snow Labs +name: teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10 +date: 2024-09-21 +tags: [he, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: he +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10` is a Hebrew model originally trained by cantillation. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10_he_5.5.0_3.0_1726878969639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10_he_5.5.0_3.0_1726878969639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10","he") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10", "he") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|teamim_tiny_weightdecay_0_05_combined_data_date_17_07_2024_10_10| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|he| +|Size:|388.7 MB| + +## References + +https://huggingface.co/cantillation/Teamim-tiny_WeightDecay-0.05_Combined-Data_date-17-07-2024_10-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12_en.md b/docs/_posts/ahmedlone127/2024-09-21-tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12_en.md new file mode 100644 index 00000000000000..8f8f3585f3c0c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12_en.md @@ -0,0 +1,84 @@ +--- +layout: model +title: English tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12 WhisperForCTC from saahith +author: John Snow Labs +name: tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12 +date: 2024-09-21 +tags: [en, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12` is an English model originally trained by saahith.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12_en_5.5.0_3.0_1726908295806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12_en_5.5.0_3.0_1726908295806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12","en") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12", "en") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_english_combined_v4_1_0_32_1e_06_cool_sweep_12| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|391.3 MB| + +## References + +https://huggingface.co/saahith/tiny.en-combined_v4-1-0-32-1e-06-cool-sweep-12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4_en.md b/docs/_posts/ahmedlone127/2024-09-21-tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4_en.md new file mode 100644 index 00000000000000..b5289efbcac767 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4_en.md @@ -0,0 +1,84 @@ +--- +layout: model +title: English tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4 WhisperForCTC from saahith +author: John Snow Labs +name: tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4 +date: 2024-09-21 +tags: [en, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4` is an English model originally trained by saahith.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4_en_5.5.0_3.0_1726903295470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4_en_5.5.0_3.0_1726903295470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4","en") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4", "en") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_english_emsassist_2_1_0_16_1e_05_eager_sweep_4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|394.3 MB| + +## References + +https://huggingface.co/saahith/tiny.en-EMSAssist-2-1-0-16-1e-05-eager-sweep-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-turkish_lyric_tonga_tonga_islands_genre_tr.md b/docs/_posts/ahmedlone127/2024-09-21-turkish_lyric_tonga_tonga_islands_genre_tr.md new file mode 100644 index 00000000000000..3a289432e15608 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-turkish_lyric_tonga_tonga_islands_genre_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish turkish_lyric_tonga_tonga_islands_genre BertForSequenceClassification from Veucci +author: John Snow Labs +name: turkish_lyric_tonga_tonga_islands_genre +date: 2024-09-21 +tags: [tr, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`turkish_lyric_tonga_tonga_islands_genre` is a Turkish model originally trained by Veucci. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_lyric_tonga_tonga_islands_genre_tr_5.5.0_3.0_1726902746886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_lyric_tonga_tonga_islands_genre_tr_5.5.0_3.0_1726902746886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("turkish_lyric_tonga_tonga_islands_genre","tr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("turkish_lyric_tonga_tonga_islands_genre", "tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_lyric_tonga_tonga_islands_genre| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Veucci/turkish-lyric-to-genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-whisper_base_malayalam_redw0rm_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-21-whisper_base_malayalam_redw0rm_pipeline_en.md new file mode 100644 index 00000000000000..d720afd1b55daf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-whisper_base_malayalam_redw0rm_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English whisper_base_malayalam_redw0rm_pipeline pipeline WhisperForCTC from redw0rm +author: John Snow Labs +name: whisper_base_malayalam_redw0rm_pipeline +date: 2024-09-21 +tags: [en, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_base_malayalam_redw0rm_pipeline` is an English model originally trained by redw0rm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_base_malayalam_redw0rm_pipeline_en_5.5.0_3.0_1726950658190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_base_malayalam_redw0rm_pipeline_en_5.5.0_3.0_1726950658190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_base_malayalam_redw0rm_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_base_malayalam_redw0rm_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_base_malayalam_redw0rm_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|644.1 MB| + +## References + +https://huggingface.co/redw0rm/whisper-base-ml + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-whisper_small_ga2en_v1_2_r_pipeline_ga.md b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_ga2en_v1_2_r_pipeline_ga.md new file mode 100644 index 00000000000000..924287b9f04a0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_ga2en_v1_2_r_pipeline_ga.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Irish whisper_small_ga2en_v1_2_r_pipeline pipeline WhisperForCTC from ymoslem +author: John Snow Labs +name: whisper_small_ga2en_v1_2_r_pipeline +date: 2024-09-21 +tags: [ga, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: ga +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_small_ga2en_v1_2_r_pipeline` is an Irish model originally trained by ymoslem. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_ga2en_v1_2_r_pipeline_ga_5.5.0_3.0_1726937558716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_ga2en_v1_2_r_pipeline_ga_5.5.0_3.0_1726937558716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_small_ga2en_v1_2_r_pipeline", lang = "ga") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_small_ga2en_v1_2_r_pipeline", lang = "ga") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_ga2en_v1_2_r_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ga| +|Size:|1.7 GB| + +## References + +https://huggingface.co/ymoslem/whisper-small-ga2en-v1.2-r + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-whisper_small_igbo_jamese360_ig.md b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_igbo_jamese360_ig.md new file mode 100644 index 00000000000000..2e1faca3bc51b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_igbo_jamese360_ig.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Igbo whisper_small_igbo_jamese360 WhisperForCTC from jamese360 +author: John Snow Labs +name: whisper_small_igbo_jamese360 +date: 2024-09-21 +tags: [ig, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: ig +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_small_igbo_jamese360` is an Igbo model originally trained by jamese360. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_igbo_jamese360_ig_5.5.0_3.0_1726892318017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_igbo_jamese360_ig_5.5.0_3.0_1726892318017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("whisper_small_igbo_jamese360","ig") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("whisper_small_igbo_jamese360", "ig") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_igbo_jamese360| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|ig| +|Size:|1.7 GB| + +## References + +https://huggingface.co/jamese360/whisper-small-ig \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-whisper_small_javanese_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_javanese_pipeline_en.md new file mode 100644 index 00000000000000..1108f3029435f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_javanese_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English whisper_small_javanese_pipeline pipeline WhisperForCTC from Rizka +author: John Snow Labs +name: whisper_small_javanese_pipeline +date: 2024-09-21 +tags: [en, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_small_javanese_pipeline` is an English model originally trained by Rizka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_javanese_pipeline_en_5.5.0_3.0_1726936884345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_javanese_pipeline_en_5.5.0_3.0_1726936884345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_small_javanese_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_small_javanese_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_javanese_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.7 GB| + +## References + +https://huggingface.co/Rizka/whisper-small-jv + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-21-whisper_small_vasi001_pipeline_hi.md b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_vasi001_pipeline_hi.md new file mode 100644 index 00000000000000..5906cf8aabb558 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-21-whisper_small_vasi001_pipeline_hi.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Hindi whisper_small_vasi001_pipeline pipeline WhisperForCTC from Vasi001 +author: John Snow Labs +name: whisper_small_vasi001_pipeline +date: 2024-09-21 +tags: [hi, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: hi +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`whisper_small_vasi001_pipeline` is a Hindi model originally trained by Vasi001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_vasi001_pipeline_hi_5.5.0_3.0_1726937731186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_vasi001_pipeline_hi_5.5.0_3.0_1726937731186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_small_vasi001_pipeline", lang = "hi") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_small_vasi001_pipeline", lang = "hi") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_vasi001_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|hi| +|Size:|1.7 GB| + +## References + +https://huggingface.co/Vasi001/whisper-small + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-all_roberta_large_v1_home_1000_16_5_oos_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-all_roberta_large_v1_home_1000_16_5_oos_pipeline_en.md new file mode 100644 index 00000000000000..2ce208bc7ccc6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-all_roberta_large_v1_home_1000_16_5_oos_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English all_roberta_large_v1_home_1000_16_5_oos_pipeline pipeline RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_home_1000_16_5_oos_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_home_1000_16_5_oos_pipeline` is an English model originally trained by fathyshalab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_home_1000_16_5_oos_pipeline_en_5.5.0_3.0_1727037032494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_home_1000_16_5_oos_pipeline_en_5.5.0_3.0_1727037032494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_roberta_large_v1_home_1000_16_5_oos_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_roberta_large_v1_home_1000_16_5_oos_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_home_1000_16_5_oos_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-home-1000-16-5-oos + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline_en.md new file mode 100644 index 00000000000000..9945e9c3782dfa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline` is an English model originally trained by danielkty22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline_en_5.5.0_3.0_1727042908813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline_en_5.5.0_3.0_1727042908813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ep_1_29_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_300_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-ep-1.29-b-32-lr-8e-07-dp-0.5-ss-0-st-False-fh-False-hs-300 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-bert_large_portuguese_cased_assin2_entailment_pt.md b/docs/_posts/ahmedlone127/2024-09-22-bert_large_portuguese_cased_assin2_entailment_pt.md new file mode 100644 index 00000000000000..b9d588c05cc755 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-bert_large_portuguese_cased_assin2_entailment_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese bert_large_portuguese_cased_assin2_entailment BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_large_portuguese_cased_assin2_entailment +date: 2024-09-22 +tags: [pt, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_large_portuguese_cased_assin2_entailment` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_cased_assin2_entailment_pt_5.5.0_3.0_1727032489823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_cased_assin2_entailment_pt_5.5.0_3.0_1727032489823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_portuguese_cased_assin2_entailment","pt") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_portuguese_cased_assin2_entailment", "pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_portuguese_cased_assin2_entailment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ruanchaves/bert-large-portuguese-cased-assin2-entailment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-brand_classification_20240708_model_2_distilbert_0_980011_en.md b/docs/_posts/ahmedlone127/2024-09-22-brand_classification_20240708_model_2_distilbert_0_980011_en.md new file mode 100644 index 00000000000000..148874f3341d01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-brand_classification_20240708_model_2_distilbert_0_980011_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English brand_classification_20240708_model_2_distilbert_0_980011 DistilBertForSequenceClassification from jointriple +author: John Snow Labs +name: brand_classification_20240708_model_2_distilbert_0_980011 +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `brand_classification_20240708_model_2_distilbert_0_980011` is an English model originally trained by jointriple.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/brand_classification_20240708_model_2_distilbert_0_980011_en_5.5.0_3.0_1727012231653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/brand_classification_20240708_model_2_distilbert_0_980011_en_5.5.0_3.0_1727012231653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("brand_classification_20240708_model_2_distilbert_0_980011","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("brand_classification_20240708_model_2_distilbert_0_980011", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|brand_classification_20240708_model_2_distilbert_0_980011| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|255.5 MB| + +## References + +https://huggingface.co/jointriple/brand_classification_20240708_model_2_distilbert_0_980011 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_eli5_mlm_model_cecilia0409_en.md b/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_eli5_mlm_model_cecilia0409_en.md new file mode 100644 index 00000000000000..c155db84031297 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_eli5_mlm_model_cecilia0409_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English burmese_awesome_eli5_mlm_model_cecilia0409 RoBertaEmbeddings from Cecilia0409 +author: John Snow Labs +name: burmese_awesome_eli5_mlm_model_cecilia0409 +date: 2024-09-22 +tags: [en, open_source, onnx, embeddings, roberta] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_eli5_mlm_model_cecilia0409` is an English model originally trained by Cecilia0409. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_eli5_mlm_model_cecilia0409_en_5.5.0_3.0_1727000018309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_eli5_mlm_model_cecilia0409_en_5.5.0_3.0_1727000018309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = RoBertaEmbeddings.pretrained("burmese_awesome_eli5_mlm_model_cecilia0409","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = RoBertaEmbeddings.pretrained("burmese_awesome_eli5_mlm_model_cecilia0409","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_eli5_mlm_model_cecilia0409| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[roberta]| +|Language:|en| +|Size:|306.5 MB| + +## References + +https://huggingface.co/Cecilia0409/my_awesome_eli5_mlm_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_eli5_mlm_model_philander_en.md b/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_eli5_mlm_model_philander_en.md new file mode 100644 index 00000000000000..9dcd2c3c5898d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_eli5_mlm_model_philander_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English burmese_awesome_eli5_mlm_model_philander RoBertaEmbeddings from PHILANDER +author: John Snow Labs +name: burmese_awesome_eli5_mlm_model_philander +date: 2024-09-22 +tags: [en, open_source, onnx, embeddings, roberta] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_eli5_mlm_model_philander` is an English model originally trained by PHILANDER. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_eli5_mlm_model_philander_en_5.5.0_3.0_1727041615188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_eli5_mlm_model_philander_en_5.5.0_3.0_1727041615188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = RoBertaEmbeddings.pretrained("burmese_awesome_eli5_mlm_model_philander","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = RoBertaEmbeddings.pretrained("burmese_awesome_eli5_mlm_model_philander","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_eli5_mlm_model_philander| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[roberta]| +|Language:|en| +|Size:|306.5 MB| + +## References + +https://huggingface.co/PHILANDER/my_awesome_eli5_mlm_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_model_mou11209203_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_model_mou11209203_pipeline_en.md new file mode 100644 index 00000000000000..68cdd963f1549e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-burmese_awesome_model_mou11209203_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_model_mou11209203_pipeline pipeline DistilBertForSequenceClassification from Mou11209203 +author: John Snow Labs +name: burmese_awesome_model_mou11209203_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_mou11209203_pipeline` is an English model originally trained by Mou11209203. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_mou11209203_pipeline_en_5.5.0_3.0_1727012979332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_mou11209203_pipeline_en_5.5.0_3.0_1727012979332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_model_mou11209203_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_model_mou11209203_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_mou11209203_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Mou11209203/my_awesome_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-chatgpt_essay_llms_en.md b/docs/_posts/ahmedlone127/2024-09-22-chatgpt_essay_llms_en.md new file mode 100644 index 00000000000000..3d61eef24417ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-chatgpt_essay_llms_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English chatgpt_essay_llms DistilBertForSequenceClassification from huyen89 +author: John Snow Labs +name: chatgpt_essay_llms +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt_essay_llms` is an English model originally trained by huyen89. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_essay_llms_en_5.5.0_3.0_1726980693273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_essay_llms_en_5.5.0_3.0_1726980693273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("chatgpt_essay_llms","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("chatgpt_essay_llms", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatgpt_essay_llms| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/huyen89/ChatGPT-Essay_LLMs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-clasificadormotivomora_distilbert_en.md b/docs/_posts/ahmedlone127/2024-09-22-clasificadormotivomora_distilbert_en.md new file mode 100644 index 00000000000000..45901591355ed3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-clasificadormotivomora_distilbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English clasificadormotivomora_distilbert DistilBertForSequenceClassification from Arodrigo +author: John Snow Labs +name: clasificadormotivomora_distilbert +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clasificadormotivomora_distilbert` is an English model originally trained by Arodrigo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clasificadormotivomora_distilbert_en_5.5.0_3.0_1727035493766.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clasificadormotivomora_distilbert_en_5.5.0_3.0_1727035493766.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("clasificadormotivomora_distilbert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("clasificadormotivomora_distilbert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clasificadormotivomora_distilbert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Arodrigo/ClasificadorMotivoMora-Distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-darkstar_bert_ome1_en.md b/docs/_posts/ahmedlone127/2024-09-22-darkstar_bert_ome1_en.md new file mode 100644 index 00000000000000..6273d2dc6da3ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-darkstar_bert_ome1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English darkstar_bert_ome1 BertForSequenceClassification from Schmitz005 +author: John Snow Labs +name: darkstar_bert_ome1 +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `darkstar_bert_ome1` is an English model originally trained by Schmitz005. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/darkstar_bert_ome1_en_5.5.0_3.0_1726990806523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/darkstar_bert_ome1_en_5.5.0_3.0_1726990806523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("darkstar_bert_ome1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("darkstar_bert_ome1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|darkstar_bert_ome1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Schmitz005/Darkstar-Bert-ome1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline_en.md new file mode 100644 index 00000000000000..7b790b2f0cfd82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline pipeline DistilBertForSequenceClassification from ArtoriasXV +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline` is an English model originally trained by ArtoriasXV. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline_en_5.5.0_3.0_1727012464582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline_en_5.5.0_3.0_1727012464582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_artoriasxv_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ArtoriasXV/distilbert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-distilbert_food_hightensan_en.md b/docs/_posts/ahmedlone127/2024-09-22-distilbert_food_hightensan_en.md new file mode 100644 index 00000000000000..0808dd99ec1aba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-distilbert_food_hightensan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_food_hightensan DistilBertForSequenceClassification from hightensan +author: John Snow Labs +name: distilbert_food_hightensan +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_food_hightensan` is an English model originally trained by hightensan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_food_hightensan_en_5.5.0_3.0_1727020613845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_food_hightensan_en_5.5.0_3.0_1727020613845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_food_hightensan","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_food_hightensan", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_food_hightensan| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hightensan/distilbert-food \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-distilbert_turkish_turkish_spam_email_tr.md b/docs/_posts/ahmedlone127/2024-09-22-distilbert_turkish_turkish_spam_email_tr.md new file mode 100644 index 00000000000000..7150ef5b6db751 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-distilbert_turkish_turkish_spam_email_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish distilbert_turkish_turkish_spam_email DistilBertForSequenceClassification from anilguven +author: John Snow Labs +name: distilbert_turkish_turkish_spam_email +date: 2024-09-22 +tags: [tr, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_turkish_turkish_spam_email` is a Turkish model originally trained by anilguven. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_turkish_turkish_spam_email_tr_5.5.0_3.0_1727020393280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_turkish_turkish_spam_email_tr_5.5.0_3.0_1727020393280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_turkish_turkish_spam_email","tr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_turkish_turkish_spam_email", "tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_turkish_turkish_spam_email| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|254.1 MB| + +## References + +https://huggingface.co/anilguven/distilbert_tr_turkish_spam_email \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-distilroberta_base_mrpc_glue_kevinvelez18_en.md b/docs/_posts/ahmedlone127/2024-09-22-distilroberta_base_mrpc_glue_kevinvelez18_en.md new file mode 100644 index 00000000000000..e20b99ac5ea7b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-distilroberta_base_mrpc_glue_kevinvelez18_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilroberta_base_mrpc_glue_kevinvelez18 RoBertaForSequenceClassification from kevinvelez18 +author: John Snow Labs +name: distilroberta_base_mrpc_glue_kevinvelez18 +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_mrpc_glue_kevinvelez18` is an English model originally trained by kevinvelez18. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_kevinvelez18_en_5.5.0_3.0_1726967970139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_kevinvelez18_en_5.5.0_3.0_1726967970139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_kevinvelez18","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_kevinvelez18", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_mrpc_glue_kevinvelez18| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/kevinvelez18/distilroberta-base-mrpc-glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-distilroberta_finetuned_financial_text_regression_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-distilroberta_finetuned_financial_text_regression_pipeline_en.md new file mode 100644 index 00000000000000..bc8692eec4794f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-distilroberta_finetuned_financial_text_regression_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilroberta_finetuned_financial_text_regression_pipeline pipeline RoBertaForSequenceClassification from lwat64 +author: John Snow Labs +name: distilroberta_finetuned_financial_text_regression_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_finetuned_financial_text_regression_pipeline` is an English model originally trained by lwat64. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_finetuned_financial_text_regression_pipeline_en_5.5.0_3.0_1726972150024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_finetuned_financial_text_regression_pipeline_en_5.5.0_3.0_1726972150024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilroberta_finetuned_financial_text_regression_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilroberta_finetuned_financial_text_regression_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_finetuned_financial_text_regression_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/lwat64/distilroberta-finetuned-financial-text-regression + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-finetuning_sentiment_model_3000_samples_neo111x_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-finetuning_sentiment_model_3000_samples_neo111x_pipeline_en.md new file mode 100644 index 00000000000000..fd3d8a8b09eb34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-finetuning_sentiment_model_3000_samples_neo111x_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_neo111x_pipeline pipeline DistilBertForSequenceClassification from Neo111x +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_neo111x_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_neo111x_pipeline` is an English model originally trained by Neo111x. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_neo111x_pipeline_en_5.5.0_3.0_1727020409265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_neo111x_pipeline_en_5.5.0_3.0_1727020409265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_sentiment_model_3000_samples_neo111x_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_sentiment_model_3000_samples_neo111x_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_neo111x_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Neo111x/finetuning-sentiment-model-3000-samples + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-ft_distilbert_base_uncased_nlp_feup_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-ft_distilbert_base_uncased_nlp_feup_pipeline_en.md new file mode 100644 index 00000000000000..d2505355ab9a82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-ft_distilbert_base_uncased_nlp_feup_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ft_distilbert_base_uncased_nlp_feup_pipeline pipeline DistilBertForSequenceClassification from NLP-FEUP +author: John Snow Labs +name: ft_distilbert_base_uncased_nlp_feup_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ft_distilbert_base_uncased_nlp_feup_pipeline` is an English model originally trained by NLP-FEUP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ft_distilbert_base_uncased_nlp_feup_pipeline_en_5.5.0_3.0_1727035525332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ft_distilbert_base_uncased_nlp_feup_pipeline_en_5.5.0_3.0_1727035525332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ft_distilbert_base_uncased_nlp_feup_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ft_distilbert_base_uncased_nlp_feup_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ft_distilbert_base_uncased_nlp_feup_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/NLP-FEUP/FT-distilbert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-hello_classification_model_en.md b/docs/_posts/ahmedlone127/2024-09-22-hello_classification_model_en.md new file mode 100644 index 00000000000000..0d0b89c7504dbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-hello_classification_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English hello_classification_model DistilBertForSequenceClassification from krishnareddy +author: John Snow Labs +name: hello_classification_model +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hello_classification_model` is an English model originally trained by krishnareddy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hello_classification_model_en_5.5.0_3.0_1727012455670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hello_classification_model_en_5.5.0_3.0_1727012455670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hello_classification_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hello_classification_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hello_classification_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/krishnareddy/hello_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-imdb4_en.md b/docs/_posts/ahmedlone127/2024-09-22-imdb4_en.md new file mode 100644 index 00000000000000..19a9e5fda9ffc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-imdb4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English imdb4 BertForSequenceClassification from Lumos +author: John Snow Labs +name: imdb4 +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb4` is an English model originally trained by Lumos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb4_en_5.5.0_3.0_1727032501843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb4_en_5.5.0_3.0_1727032501843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("imdb4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("imdb4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdb4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Lumos/imdb4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-imdbreviews_classification_roberta_v02_clf_finetuning_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-imdbreviews_classification_roberta_v02_clf_finetuning_pipeline_en.md new file mode 100644 index 00000000000000..8127e33f17d98c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-imdbreviews_classification_roberta_v02_clf_finetuning_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English imdbreviews_classification_roberta_v02_clf_finetuning_pipeline pipeline RoBertaForSequenceClassification from darmendarizp +author: John Snow Labs +name: imdbreviews_classification_roberta_v02_clf_finetuning_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdbreviews_classification_roberta_v02_clf_finetuning_pipeline` is an English model originally trained by darmendarizp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdbreviews_classification_roberta_v02_clf_finetuning_pipeline_en_5.5.0_3.0_1727017321135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdbreviews_classification_roberta_v02_clf_finetuning_pipeline_en_5.5.0_3.0_1727017321135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("imdbreviews_classification_roberta_v02_clf_finetuning_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("imdbreviews_classification_roberta_v02_clf_finetuning_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdbreviews_classification_roberta_v02_clf_finetuning_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/darmendarizp/imdbreviews_classification_roberta_v02_clf_finetuning + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-minilmv2_l6_h384_mlm_multi_emails_hq_en.md b/docs/_posts/ahmedlone127/2024-09-22-minilmv2_l6_h384_mlm_multi_emails_hq_en.md new file mode 100644 index 00000000000000..5a3b84a8ca4077 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-minilmv2_l6_h384_mlm_multi_emails_hq_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English minilmv2_l6_h384_mlm_multi_emails_hq RoBertaEmbeddings from postbot +author: John Snow Labs +name: minilmv2_l6_h384_mlm_multi_emails_hq +date: 2024-09-22 +tags: [en, open_source, onnx, embeddings, roberta] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilmv2_l6_h384_mlm_multi_emails_hq` is an English model originally trained by postbot. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h384_mlm_multi_emails_hq_en_5.5.0_3.0_1727042048476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h384_mlm_multi_emails_hq_en_5.5.0_3.0_1727042048476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = RoBertaEmbeddings.pretrained("minilmv2_l6_h384_mlm_multi_emails_hq","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = RoBertaEmbeddings.pretrained("minilmv2_l6_h384_mlm_multi_emails_hq","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilmv2_l6_h384_mlm_multi_emails_hq| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[roberta]| +|Language:|en| +|Size:|114.2 MB| + +## References + +https://huggingface.co/postbot/MiniLMv2-L6-H384-mlm-multi-emails-hq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-persian_text_emotion_bert_v1_fa.md b/docs/_posts/ahmedlone127/2024-09-22-persian_text_emotion_bert_v1_fa.md new file mode 100644 index 00000000000000..baaaa7100a29a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-persian_text_emotion_bert_v1_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian persian_text_emotion_bert_v1 BertForSequenceClassification from SeyedAli +author: John Snow Labs +name: persian_text_emotion_bert_v1 +date: 2024-09-22 +tags: [fa, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `persian_text_emotion_bert_v1` is a Persian model originally trained by SeyedAli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/persian_text_emotion_bert_v1_fa_5.5.0_3.0_1726988660120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/persian_text_emotion_bert_v1_fa_5.5.0_3.0_1726988660120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("persian_text_emotion_bert_v1","fa") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("persian_text_emotion_bert_v1", "fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|persian_text_emotion_bert_v1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/SeyedAli/Persian-Text-Emotion-Bert-V1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-roberta_base_finetuned_sleevelength_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-roberta_base_finetuned_sleevelength_pipeline_en.md new file mode 100644 index 00000000000000..49dcdf5f5000db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-roberta_base_finetuned_sleevelength_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_finetuned_sleevelength_pipeline pipeline RoBertaForSequenceClassification from Cournane +author: John Snow Labs +name: roberta_base_finetuned_sleevelength_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_sleevelength_pipeline` is an English model originally trained by Cournane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sleevelength_pipeline_en_5.5.0_3.0_1727036931911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sleevelength_pipeline_en_5.5.0_3.0_1727036931911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_base_finetuned_sleevelength_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_base_finetuned_sleevelength_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_sleevelength_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|427.2 MB| + +## References + +https://huggingface.co/Cournane/roberta-base-finetuned-SleeveLength + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-sent_bert_base_finnish_europeana_cased_en.md b/docs/_posts/ahmedlone127/2024-09-22-sent_bert_base_finnish_europeana_cased_en.md new file mode 100644 index 00000000000000..0460d492ccdfba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-sent_bert_base_finnish_europeana_cased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_base_finnish_europeana_cased BertSentenceEmbeddings from dbmdz +author: John Snow Labs +name: sent_bert_base_finnish_europeana_cased +date: 2024-09-22 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_base_finnish_europeana_cased` is an English model originally trained by dbmdz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_finnish_europeana_cased_en_5.5.0_3.0_1727013591575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_finnish_europeana_cased_en_5.5.0_3.0_1727013591575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_finnish_europeana_cased","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_finnish_europeana_cased","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_finnish_europeana_cased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|411.3 MB| + +## References + +https://huggingface.co/dbmdz/bert-base-finnish-europeana-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-sharif_pors_bert_base_sharif_qa_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-sharif_pors_bert_base_sharif_qa_pipeline_en.md new file mode 100644 index 00000000000000..85c36d93cdf53c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-sharif_pors_bert_base_sharif_qa_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English sharif_pors_bert_base_sharif_qa_pipeline pipeline BertForQuestionAnswering from parsi-ai-nlpclass +author: John Snow Labs +name: sharif_pors_bert_base_sharif_qa_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sharif_pors_bert_base_sharif_qa_pipeline` is an English model originally trained by parsi-ai-nlpclass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sharif_pors_bert_base_sharif_qa_pipeline_en_5.5.0_3.0_1727042513072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sharif_pors_bert_base_sharif_qa_pipeline_en_5.5.0_3.0_1727042513072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sharif_pors_bert_base_sharif_qa_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sharif_pors_bert_base_sharif_qa_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sharif_pors_bert_base_sharif_qa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|606.5 MB| + +## References + +https://huggingface.co/parsi-ai-nlpclass/Sharif-Pors-bert-base-sharif-qa + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-tapp_multilabel_climatebert_f_en.md b/docs/_posts/ahmedlone127/2024-09-22-tapp_multilabel_climatebert_f_en.md new file mode 100644 index 00000000000000..08ad1bb1a9be1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-tapp_multilabel_climatebert_f_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tapp_multilabel_climatebert_f RoBertaForSequenceClassification from GIZ +author: John Snow Labs +name: tapp_multilabel_climatebert_f +date: 2024-09-22 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tapp_multilabel_climatebert_f` is an English model originally trained by GIZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tapp_multilabel_climatebert_f_en_5.5.0_3.0_1726972192416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tapp_multilabel_climatebert_f_en_5.5.0_3.0_1726972192416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tapp_multilabel_climatebert_f","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tapp_multilabel_climatebert_f", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tapp_multilabel_climatebert_f| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/GIZ/TAPP-multilabel-climatebert_f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-trainer8_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-22-trainer8_pipeline_en.md new file mode 100644 index 00000000000000..2df849cd3b3518 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-trainer8_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English trainer8_pipeline pipeline DistilBertForSequenceClassification from SimoneJLaudani +author: John Snow Labs +name: trainer8_pipeline +date: 2024-09-22 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trainer8_pipeline` is an English model originally trained by SimoneJLaudani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trainer8_pipeline_en_5.5.0_3.0_1727020498492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trainer8_pipeline_en_5.5.0_3.0_1727020498492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trainer8_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trainer8_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trainer8_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/SimoneJLaudani/trainer8 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-whisper_small_arabic_huggingpanda_en.md b/docs/_posts/ahmedlone127/2024-09-22-whisper_small_arabic_huggingpanda_en.md new file mode 100644 index 00000000000000..0416eac15d2384 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-whisper_small_arabic_huggingpanda_en.md @@ -0,0 +1,84 @@ +--- +layout: model +title: English whisper_small_arabic_huggingpanda WhisperForCTC from HuggingPanda +author: John Snow Labs +name: whisper_small_arabic_huggingpanda +date: 2024-09-22 +tags: [en, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_small_arabic_huggingpanda` is an English model originally trained by HuggingPanda. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_arabic_huggingpanda_en_5.5.0_3.0_1727022945732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_arabic_huggingpanda_en_5.5.0_3.0_1727022945732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("whisper_small_arabic_huggingpanda","en") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("whisper_small_arabic_huggingpanda", "en") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_arabic_huggingpanda| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|1.7 GB| + +## References + +https://huggingface.co/HuggingPanda/whisper-small-arabic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-22-xlm_roberta_base_finetuned_panx_all_ashrielbrian_en.md b/docs/_posts/ahmedlone127/2024-09-22-xlm_roberta_base_finetuned_panx_all_ashrielbrian_en.md new file mode 100644 index 00000000000000..b342f2cea1240e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-22-xlm_roberta_base_finetuned_panx_all_ashrielbrian_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_ashrielbrian XlmRoBertaForTokenClassification from ashrielbrian +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_ashrielbrian +date: 2024-09-22 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_all_ashrielbrian` is an English model originally trained by ashrielbrian. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_ashrielbrian_en_5.5.0_3.0_1726970644262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_ashrielbrian_en_5.5.0_3.0_1726970644262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_ashrielbrian","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_ashrielbrian", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_ashrielbrian| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|858.2 MB| + +## References + +https://huggingface.co/ashrielbrian/xlm-roberta-base-finetuned-panx-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-ag_news_roberta_base_seed_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-ag_news_roberta_base_seed_2_pipeline_en.md new file mode 100644 index 00000000000000..5bc9692043e315 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-ag_news_roberta_base_seed_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ag_news_roberta_base_seed_2_pipeline pipeline RoBertaForSequenceClassification from utahnlp +author: John Snow Labs +name: ag_news_roberta_base_seed_2_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_news_roberta_base_seed_2_pipeline` is an English model originally trained by utahnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ag_news_roberta_base_seed_2_pipeline_en_5.5.0_3.0_1727055279043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ag_news_roberta_base_seed_2_pipeline_en_5.5.0_3.0_1727055279043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ag_news_roberta_base_seed_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ag_news_roberta_base_seed_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ag_news_roberta_base_seed_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|463.3 MB| + +## References + +https://huggingface.co/utahnlp/ag_news_roberta-base_seed-2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline_en.md new file mode 100644 index 00000000000000..ac91be5fa42791 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline pipeline RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline` is an English model originally trained by fathyshalab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline_en_5.5.0_3.0_1727134945968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline_en_5.5.0_3.0_1727134945968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_auto_and_commute_1000_16_5_oos_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-auto_and_commute-1000-16-5-oos + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-augmented_model_fast_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-augmented_model_fast_1_pipeline_en.md new file mode 100644 index 00000000000000..dd6444691d28eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-augmented_model_fast_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English augmented_model_fast_1_pipeline pipeline DistilBertForSequenceClassification from LeonardoFettucciari +author: John Snow Labs +name: augmented_model_fast_1_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `augmented_model_fast_1_pipeline` is an English model originally trained by LeonardoFettucciari. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/augmented_model_fast_1_pipeline_en_5.5.0_3.0_1727073536369.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/augmented_model_fast_1_pipeline_en_5.5.0_3.0_1727073536369.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("augmented_model_fast_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("augmented_model_fast_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|augmented_model_fast_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/LeonardoFettucciari/augmented_model_fast_1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-autotrain_xlmroberta_iuexist_50302120401_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-autotrain_xlmroberta_iuexist_50302120401_pipeline_en.md new file mode 100644 index 00000000000000..fe5ecfd435626e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-autotrain_xlmroberta_iuexist_50302120401_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English autotrain_xlmroberta_iuexist_50302120401_pipeline pipeline XlmRoBertaForSequenceClassification from Muhsabrys +author: John Snow Labs +name: autotrain_xlmroberta_iuexist_50302120401_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_xlmroberta_iuexist_50302120401_pipeline` is an English model originally trained by Muhsabrys. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_xlmroberta_iuexist_50302120401_pipeline_en_5.5.0_3.0_1727126044084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_xlmroberta_iuexist_50302120401_pipeline_en_5.5.0_3.0_1727126044084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("autotrain_xlmroberta_iuexist_50302120401_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("autotrain_xlmroberta_iuexist_50302120401_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_xlmroberta_iuexist_50302120401_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/Muhsabrys/autotrain-xlmroberta-iuexist-50302120401 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-bert_base_uncase_conll2012_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-bert_base_uncase_conll2012_pipeline_en.md new file mode 100644 index 00000000000000..84ac539e546e3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-bert_base_uncase_conll2012_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncase_conll2012_pipeline pipeline BertForTokenClassification from sarveshsk +author: John Snow Labs +name: bert_base_uncase_conll2012_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncase_conll2012_pipeline` is an English model originally trained by sarveshsk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncase_conll2012_pipeline_en_5.5.0_3.0_1727111323814.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncase_conll2012_pipeline_en_5.5.0_3.0_1727111323814.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncase_conll2012_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncase_conll2012_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncase_conll2012_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/sarveshsk/bert_base_uncase_Conll2012 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-bert_base_uncased_conll2003_joshuaphua_en.md b/docs/_posts/ahmedlone127/2024-09-23-bert_base_uncased_conll2003_joshuaphua_en.md new file mode 100644 index 00000000000000..6fdf5ea9aa5a79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-bert_base_uncased_conll2003_joshuaphua_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_conll2003_joshuaphua BertForTokenClassification from joshuaphua +author: John Snow Labs +name: bert_base_uncased_conll2003_joshuaphua +date: 2024-09-23 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_conll2003_joshuaphua` is an English model originally trained by joshuaphua. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_conll2003_joshuaphua_en_5.5.0_3.0_1727098255511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_conll2003_joshuaphua_en_5.5.0_3.0_1727098255511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_conll2003_joshuaphua","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_conll2003_joshuaphua", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_conll2003_joshuaphua| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/joshuaphua/bert-base-uncased-conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline_en.md new file mode 100644 index 00000000000000..800eb507ae57fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline pipeline BertForQuestionAnswering from mdzrg +author: John Snow Labs +name: bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline` is an English model originally trained by mdzrg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline_en_5.5.0_3.0_1727049738034.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline_en_5.5.0_3.0_1727049738034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_whole_word_masking_finetuned_squad_dev_one_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/mdzrg/bert-large-uncased-whole-word-masking-finetuned-squad-dev-one + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-bertin_roberta_base_spanish_finetuned_xnli_en.md b/docs/_posts/ahmedlone127/2024-09-23-bertin_roberta_base_spanish_finetuned_xnli_en.md new file mode 100644 index 00000000000000..cdf3220947d4f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-bertin_roberta_base_spanish_finetuned_xnli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bertin_roberta_base_spanish_finetuned_xnli RoBertaForSequenceClassification from dccuchile +author: John Snow Labs +name: bertin_roberta_base_spanish_finetuned_xnli +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertin_roberta_base_spanish_finetuned_xnli` is an English model originally trained by dccuchile. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertin_roberta_base_spanish_finetuned_xnli_en_5.5.0_3.0_1727135230494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertin_roberta_base_spanish_finetuned_xnli_en_5.5.0_3.0_1727135230494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertin_roberta_base_spanish_finetuned_xnli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertin_roberta_base_spanish_finetuned_xnli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertin_roberta_base_spanish_finetuned_xnli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.5 MB| + +## References + +https://huggingface.co/dccuchile/bertin-roberta-base-spanish-finetuned-xnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-burmese_awesome_model_bertester_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-burmese_awesome_model_bertester_pipeline_en.md new file mode 100644 index 00000000000000..ce0e2ad37aad6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-burmese_awesome_model_bertester_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_model_bertester_pipeline pipeline DistilBertForSequenceClassification from bertester +author: John Snow Labs +name: burmese_awesome_model_bertester_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_bertester_pipeline` is an English model originally trained by bertester. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_bertester_pipeline_en_5.5.0_3.0_1727059754192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_bertester_pipeline_en_5.5.0_3.0_1727059754192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_model_bertester_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_model_bertester_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_bertester_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bertester/my_awesome_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-burmese_awesome_model_nataliacristina_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-burmese_awesome_model_nataliacristina_pipeline_en.md new file mode 100644 index 00000000000000..b9d449cd1ce559 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-burmese_awesome_model_nataliacristina_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_model_nataliacristina_pipeline pipeline DistilBertForSequenceClassification from nataliacristina +author: John Snow Labs +name: burmese_awesome_model_nataliacristina_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_nataliacristina_pipeline` is an English model originally trained by nataliacristina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_nataliacristina_pipeline_en_5.5.0_3.0_1727073759521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_nataliacristina_pipeline_en_5.5.0_3.0_1727073759521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_model_nataliacristina_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_model_nataliacristina_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_nataliacristina_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nataliacristina/my_awesome_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-distil_whisper_medium_hindi_test_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-distil_whisper_medium_hindi_test_v2_pipeline_en.md new file mode 100644 index 00000000000000..525954a88fc0f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-distil_whisper_medium_hindi_test_v2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English distil_whisper_medium_hindi_test_v2_pipeline pipeline WhisperForCTC from yi-ching +author: John Snow Labs +name: distil_whisper_medium_hindi_test_v2_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_whisper_medium_hindi_test_v2_pipeline` is an English model originally trained by yi-ching. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_whisper_medium_hindi_test_v2_pipeline_en_5.5.0_3.0_1727077847700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_whisper_medium_hindi_test_v2_pipeline_en_5.5.0_3.0_1727077847700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distil_whisper_medium_hindi_test_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distil_whisper_medium_hindi_test_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_whisper_medium_hindi_test_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.8 GB| + +## References + +https://huggingface.co/yi-ching/distil-whisper-medium-hi-test-v2 + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp_en.md b/docs/_posts/ahmedlone127/2024-09-23-distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp_en.md new file mode 100644 index 00000000000000..b2ef0d618c3295 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp DistilBertForSequenceClassification from tom192180 +author: John Snow Labs +name: distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp` is an English model originally trained by tom192180. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp_en_5.5.0_3.0_1727059649738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp_en_5.5.0_3.0_1727059649738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_odm_zphr_0st42sd_ut72ut1_plprefix0stlarge41_simsp| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/tom192180/distilbert-base-uncased_odm_zphr_0st42sd_ut72ut1_PLPrefix0stlarge41_simsp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-distilbert_base_uncased_sancho3010_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-distilbert_base_uncased_sancho3010_pipeline_en.md new file mode 100644 index 00000000000000..140925bb08dc27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-distilbert_base_uncased_sancho3010_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_sancho3010_pipeline pipeline DistilBertForSequenceClassification from Sancho3010 +author: John Snow Labs +name: distilbert_base_uncased_sancho3010_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sancho3010_pipeline` is an English model originally trained by Sancho3010. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sancho3010_pipeline_en_5.5.0_3.0_1727059132555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sancho3010_pipeline_en_5.5.0_3.0_1727059132555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_sancho3010_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_sancho3010_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sancho3010_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Sancho3010/distilbert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-distilbert_finetuned_go_emotions_dataset_en.md b/docs/_posts/ahmedlone127/2024-09-23-distilbert_finetuned_go_emotions_dataset_en.md new file mode 100644 index 00000000000000..41bdeb8d7c5425 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-distilbert_finetuned_go_emotions_dataset_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_finetuned_go_emotions_dataset DistilBertForSequenceClassification from abdurrahman22224 +author: John Snow Labs +name: distilbert_finetuned_go_emotions_dataset +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_go_emotions_dataset` is an English model originally trained by abdurrahman22224. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_go_emotions_dataset_en_5.5.0_3.0_1727087006198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_go_emotions_dataset_en_5.5.0_3.0_1727087006198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_go_emotions_dataset","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_go_emotions_dataset", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_go_emotions_dataset| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/abdurrahman22224/distilbert-finetuned-go-emotions_dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-distilbert_imdb_decentmakeover13_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-distilbert_imdb_decentmakeover13_pipeline_en.md new file mode 100644 index 00000000000000..ac38bdcb8028cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-distilbert_imdb_decentmakeover13_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_imdb_decentmakeover13_pipeline pipeline DistilBertForSequenceClassification from decentmakeover13 +author: John Snow Labs +name: distilbert_imdb_decentmakeover13_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_decentmakeover13_pipeline` is an English model originally trained by decentmakeover13. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_decentmakeover13_pipeline_en_5.5.0_3.0_1727082296688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_decentmakeover13_pipeline_en_5.5.0_3.0_1727082296688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_imdb_decentmakeover13_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_imdb_decentmakeover13_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_decentmakeover13_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/decentmakeover13/distilbert-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-finetuning_sentiment_model_3000_samples_murali07_en.md b/docs/_posts/ahmedlone127/2024-09-23-finetuning_sentiment_model_3000_samples_murali07_en.md new file mode 100644 index 00000000000000..955c808e8cc95d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-finetuning_sentiment_model_3000_samples_murali07_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_murali07 DistilBertForSequenceClassification from murali07 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_murali07 +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_murali07` is an English model originally trained by murali07. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_murali07_en_5.5.0_3.0_1727108473481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_murali07_en_5.5.0_3.0_1727108473481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_murali07","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_murali07", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_murali07| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/murali07/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-finetuning_sentiment_model_3000_samples_thanhchauns2_en.md b/docs/_posts/ahmedlone127/2024-09-23-finetuning_sentiment_model_3000_samples_thanhchauns2_en.md new file mode 100644 index 00000000000000..2fc790f34f69bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-finetuning_sentiment_model_3000_samples_thanhchauns2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_thanhchauns2 DistilBertForSequenceClassification from thanhchauns2 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_thanhchauns2 +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_thanhchauns2` is an English model originally trained by thanhchauns2. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_thanhchauns2_en_5.5.0_3.0_1727082610717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_thanhchauns2_en_5.5.0_3.0_1727082610717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_thanhchauns2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_thanhchauns2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_thanhchauns2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/thanhchauns2/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-hihu5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-hihu5_pipeline_en.md new file mode 100644 index 00000000000000..d3efecf63be9d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-hihu5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English hihu5_pipeline pipeline XlmRoBertaForSequenceClassification from wnic00 +author: John Snow Labs +name: hihu5_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hihu5_pipeline` is an English model originally trained by wnic00. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hihu5_pipeline_en_5.5.0_3.0_1727126809178.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hihu5_pipeline_en_5.5.0_3.0_1727126809178.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hihu5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hihu5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hihu5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/wnic00/hihu5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-hw01_chchang_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-hw01_chchang_pipeline_en.md new file mode 100644 index 00000000000000..d2353641c8c617 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-hw01_chchang_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English hw01_chchang_pipeline pipeline DistilBertForSequenceClassification from CHChang +author: John Snow Labs +name: hw01_chchang_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hw01_chchang_pipeline` is an English model originally trained by CHChang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hw01_chchang_pipeline_en_5.5.0_3.0_1727093810821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hw01_chchang_pipeline_en_5.5.0_3.0_1727093810821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hw01_chchang_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hw01_chchang_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hw01_chchang_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/CHChang/HW01 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-interview_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-interview_classifier_pipeline_en.md new file mode 100644 index 00000000000000..2c4a636fe30db1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-interview_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English interview_classifier_pipeline pipeline DistilBertForSequenceClassification from eskayML +author: John Snow Labs +name: interview_classifier_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `interview_classifier_pipeline` is an English model originally trained by eskayML. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/interview_classifier_pipeline_en_5.5.0_3.0_1727073536315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/interview_classifier_pipeline_en_5.5.0_3.0_1727073536315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("interview_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("interview_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|interview_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/eskayML/interview_classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-msroberta_en.md b/docs/_posts/ahmedlone127/2024-09-23-msroberta_en.md new file mode 100644 index 00000000000000..258429c3fe5396 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-msroberta_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English msroberta RoBertaEmbeddings from nkoh01 +author: John Snow Labs +name: msroberta +date: 2024-09-23 +tags: [en, open_source, onnx, embeddings, roberta] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `msroberta` is an English model originally trained by nkoh01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/msroberta_en_5.5.0_3.0_1727056610581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/msroberta_en_5.5.0_3.0_1727056610581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = RoBertaEmbeddings.pretrained("msroberta","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = RoBertaEmbeddings.pretrained("msroberta","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|msroberta| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[roberta]| +|Language:|en| +|Size:|306.5 MB| + +## References + +https://huggingface.co/nkoh01/MSRoberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-nepal_bhasa_pretrained_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-nepal_bhasa_pretrained_model_pipeline_en.md new file mode 100644 index 00000000000000..75cdb2221090da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-nepal_bhasa_pretrained_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English nepal_bhasa_pretrained_model_pipeline pipeline DistilBertForSequenceClassification from Vigneshwaran255 +author: John Snow Labs +name: nepal_bhasa_pretrained_model_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nepal_bhasa_pretrained_model_pipeline` is an English model originally trained by Vigneshwaran255. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepal_bhasa_pretrained_model_pipeline_en_5.5.0_3.0_1727073871051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepal_bhasa_pretrained_model_pipeline_en_5.5.0_3.0_1727073871051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("nepal_bhasa_pretrained_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("nepal_bhasa_pretrained_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepal_bhasa_pretrained_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Vigneshwaran255/new_pretrained_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-newsmodelclassification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-newsmodelclassification_pipeline_en.md new file mode 100644 index 00000000000000..f0f79a12d9daa9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-newsmodelclassification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English newsmodelclassification_pipeline pipeline DistilBertForSequenceClassification from aatmasidha +author: John Snow Labs +name: newsmodelclassification_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `newsmodelclassification_pipeline` is an English model originally trained by aatmasidha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/newsmodelclassification_pipeline_en_5.5.0_3.0_1727082488143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/newsmodelclassification_pipeline_en_5.5.0_3.0_1727082488143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("newsmodelclassification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("newsmodelclassification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|newsmodelclassification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aatmasidha/newsmodelclassification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-pruned_30_model_en.md b/docs/_posts/ahmedlone127/2024-09-23-pruned_30_model_en.md new file mode 100644 index 00000000000000..0d73bfe7cf3002 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-pruned_30_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English pruned_30_model DistilBertForSequenceClassification from andygoh5 +author: John Snow Labs +name: pruned_30_model +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pruned_30_model` is an English model originally trained by andygoh5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pruned_30_model_en_5.5.0_3.0_1727082405714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pruned_30_model_en_5.5.0_3.0_1727082405714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("pruned_30_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("pruned_30_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
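A hand-wired pipeline like the one above only works if every stage reads columns that an earlier stage (or the raw DataFrame) actually produces; a name such as `documents` where `document` is meant breaks that chain. The sketch below checks the wiring on a plain description of the stages. It is a hypothetical helper, not part of Spark NLP: `find_unwired_inputs` and the `(name, inputs, output)` tuples are illustrative assumptions.

```python
def find_unwired_inputs(stages, source_columns):
    """Return (stage_name, column) pairs whose input column is not yet available.

    stages: ordered list of (name, input_columns, output_column) tuples.
    source_columns: column names present in the raw input DataFrame.
    """
    available = set(source_columns)
    problems = []
    for name, inputs, output in stages:
        for col in inputs:
            if col not in available:
                problems.append((name, col))
        # The stage's output becomes available to every later stage.
        available.add(output)
    return problems


# Hypothetical description of a three-stage classifier pipeline with a typo.
stages = [
    ("documentAssembler", ["text"], "document"),
    ("tokenizer", ["document"], "token"),
    ("sequenceClassifier", ["documents", "token"], "class"),
]
print(find_unwired_inputs(stages, ["text"]))
```

Running the check before `pipeline.fit(data)` surfaces a mismatched column name without starting a Spark job.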
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pruned_30_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/andygoh5/pruned-30-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-qa_model_gigazinie_en.md b/docs/_posts/ahmedlone127/2024-09-23-qa_model_gigazinie_en.md new file mode 100644 index 00000000000000..9452d5737f602c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-qa_model_gigazinie_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English qa_model_gigazinie BertForQuestionAnswering from Gigazinie +author: John Snow Labs +name: qa_model_gigazinie +date: 2024-09-23 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_gigazinie` is an English model originally trained by Gigazinie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_gigazinie_en_5.5.0_3.0_1727070448679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_gigazinie_en_5.5.0_3.0_1727070448679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("qa_model_gigazinie","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("qa_model_gigazinie", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_model_gigazinie| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Gigazinie/QA_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-results_yildizt_en.md b/docs/_posts/ahmedlone127/2024-09-23-results_yildizt_en.md new file mode 100644 index 00000000000000..0f22bd496e293f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-results_yildizt_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English results_yildizt DistilBertForSequenceClassification from yildizt +author: John Snow Labs +name: results_yildizt +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `results_yildizt` is an English model originally trained by yildizt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_yildizt_en_5.5.0_3.0_1727087298419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_yildizt_en_5.5.0_3.0_1727087298419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("results_yildizt","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("results_yildizt", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_yildizt| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/yildizt/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-roberta_base_go_emotions_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-roberta_base_go_emotions_pipeline_en.md new file mode 100644 index 00000000000000..d9d8845e51ab17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-roberta_base_go_emotions_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_go_emotions_pipeline pipeline DistilBertForSequenceClassification from Laddoo +author: John Snow Labs +name: roberta_base_go_emotions_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_go_emotions_pipeline` is an English model originally trained by Laddoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_go_emotions_pipeline_en_5.5.0_3.0_1727082400104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_go_emotions_pipeline_en_5.5.0_3.0_1727082400104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_base_go_emotions_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_base_go_emotions_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_go_emotions_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Laddoo/roberta-base-go_emotions + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-roberta_nba_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-roberta_nba_v2_pipeline_en.md new file mode 100644 index 00000000000000..b01f31c4a2e03d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-roberta_nba_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_nba_v2_pipeline pipeline RoBertaForSequenceClassification from sivakarri +author: John Snow Labs +name: roberta_nba_v2_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_nba_v2_pipeline` is an English model originally trained by sivakarri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_nba_v2_pipeline_en_5.5.0_3.0_1727055341552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_nba_v2_pipeline_en_5.5.0_3.0_1727055341552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_nba_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_nba_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_nba_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|430.8 MB| + +## References + +https://huggingface.co/sivakarri/roberta_nba_v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-sent_bert_base_english_spanish_portuguese_cased_en.md b/docs/_posts/ahmedlone127/2024-09-23-sent_bert_base_english_spanish_portuguese_cased_en.md new file mode 100644 index 00000000000000..fe817582358aa8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-sent_bert_base_english_spanish_portuguese_cased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_base_english_spanish_portuguese_cased BertSentenceEmbeddings from Geotrend +author: John Snow Labs +name: sent_bert_base_english_spanish_portuguese_cased +date: 2024-09-23 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_base_english_spanish_portuguese_cased` is an English model originally trained by Geotrend. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_spanish_portuguese_cased_en_5.5.0_3.0_1727091100328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_spanish_portuguese_cased_en_5.5.0_3.0_1727091100328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_english_spanish_portuguese_cased","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_english_spanish_portuguese_cased","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_english_spanish_portuguese_cased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|428.0 MB| + +## References + +https://huggingface.co/Geotrend/bert-base-en-es-pt-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-sent_biobert_patent_reference_extraction_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-sent_biobert_patent_reference_extraction_pipeline_en.md new file mode 100644 index 00000000000000..5a8861791845df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-sent_biobert_patent_reference_extraction_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_biobert_patent_reference_extraction_pipeline pipeline BertSentenceEmbeddings from kaesve +author: John Snow Labs +name: sent_biobert_patent_reference_extraction_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_biobert_patent_reference_extraction_pipeline` is an English model originally trained by kaesve. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_biobert_patent_reference_extraction_pipeline_en_5.5.0_3.0_1727123318667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_biobert_patent_reference_extraction_pipeline_en_5.5.0_3.0_1727123318667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_biobert_patent_reference_extraction_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_biobert_patent_reference_extraction_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_biobert_patent_reference_extraction_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/kaesve/BioBERT_patent_reference_extraction + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-sent_malay_bert_en.md b/docs/_posts/ahmedlone127/2024-09-23-sent_malay_bert_en.md new file mode 100644 index 00000000000000..1345e8194cf4c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-sent_malay_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_malay_bert BertSentenceEmbeddings from NLP4H +author: John Snow Labs +name: sent_malay_bert +date: 2024-09-23 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_malay_bert` is an English model originally trained by NLP4H. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_malay_bert_en_5.5.0_3.0_1727101690378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_malay_bert_en_5.5.0_3.0_1727101690378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_malay_bert","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_malay_bert","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_malay_bert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/NLP4H/ms_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-t_5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-t_5_pipeline_en.md new file mode 100644 index 00000000000000..c635bc0691faab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-t_5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English t_5_pipeline pipeline RoBertaForSequenceClassification from Pablojmed +author: John Snow Labs +name: t_5_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `t_5_pipeline` is an English model originally trained by Pablojmed. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t_5_pipeline_en_5.5.0_3.0_1727134804497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t_5_pipeline_en_5.5.0_3.0_1727134804497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t_5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t_5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t_5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|438.2 MB| + +## References + +https://huggingface.co/Pablojmed/t_5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-test_trainerb2_en.md b/docs/_posts/ahmedlone127/2024-09-23-test_trainerb2_en.md new file mode 100644 index 00000000000000..55dfb338cb7acd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-test_trainerb2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English test_trainerb2 DistilBertForSequenceClassification from SimoneJLaudani +author: John Snow Labs +name: test_trainerb2 +date: 2024-09-23 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainerb2` is an English model originally trained by SimoneJLaudani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainerb2_en_5.5.0_3.0_1727110741823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainerb2_en_5.5.0_3.0_1727110741823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("test_trainerb2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("test_trainerb2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainerb2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SimoneJLaudani/test_trainerb2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-user_476da26872df492f830a65925d422651_model_ja.md b/docs/_posts/ahmedlone127/2024-09-23-user_476da26872df492f830a65925d422651_model_ja.md new file mode 100644 index 00000000000000..546486d1843357 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-user_476da26872df492f830a65925d422651_model_ja.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Japanese user_476da26872df492f830a65925d422651_model WhisperForCTC from hoangvanvietanh +author: John Snow Labs +name: user_476da26872df492f830a65925d422651_model +date: 2024-09-23 +tags: [ja, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: ja +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`user_476da26872df492f830a65925d422651_model` is a Japanese model originally trained by hoangvanvietanh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/user_476da26872df492f830a65925d422651_model_ja_5.5.0_3.0_1727076450089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/user_476da26872df492f830a65925d422651_model_ja_5.5.0_3.0_1727076450089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("user_476da26872df492f830a65925d422651_model","ja") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("user_476da26872df492f830a65925d422651_model", "ja") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|user_476da26872df492f830a65925d422651_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|ja| +|Size:|1.7 GB| + +## References + +https://huggingface.co/hoangvanvietanh/user_476da26872df492f830a65925d422651_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-whisper_small_ar2_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-09-23-whisper_small_ar2_pipeline_ar.md new file mode 100644 index 00000000000000..712646ffd155c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-whisper_small_ar2_pipeline_ar.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Arabic whisper_small_ar2_pipeline pipeline WhisperForCTC from whitefox123 +author: John Snow Labs +name: whisper_small_ar2_pipeline +date: 2024-09-23 +tags: [ar, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whisper_small_ar2_pipeline` is an Arabic model originally trained by whitefox123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_ar2_pipeline_ar_5.5.0_3.0_1727119392969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_ar2_pipeline_ar_5.5.0_3.0_1727119392969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_small_ar2_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_small_ar2_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_ar2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|1.7 GB| + +## References + +https://huggingface.co/whitefox123/whisper-small-ar2 + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-whisper_small_hindi_auro_hi.md b/docs/_posts/ahmedlone127/2024-09-23-whisper_small_hindi_auro_hi.md new file mode 100644 index 00000000000000..0a19d9194a6e80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-whisper_small_hindi_auro_hi.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Hindi whisper_small_hindi_auro WhisperForCTC from auro +author: John Snow Labs +name: whisper_small_hindi_auro +date: 2024-09-23 +tags: [hi, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: hi +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`whisper_small_hindi_auro` is a Hindi model originally trained by auro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_hindi_auro_hi_5.5.0_3.0_1727078485296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_hindi_auro_hi_5.5.0_3.0_1727078485296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("whisper_small_hindi_auro","hi") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("whisper_small_hindi_auro", "hi") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_hindi_auro| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|hi| +|Size:|1.7 GB| + +## References + +https://huggingface.co/auro/whisper-small-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-whisper_small_hindi_talium_hi.md b/docs/_posts/ahmedlone127/2024-09-23-whisper_small_hindi_talium_hi.md new file mode 100644 index 00000000000000..f8aabfa8e07d40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-whisper_small_hindi_talium_hi.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Hindi whisper_small_hindi_talium WhisperForCTC from Talium +author: John Snow Labs +name: whisper_small_hindi_talium +date: 2024-09-23 +tags: [hi, open_source, onnx, asr, whisper] +task: Automatic Speech Recognition +language: hi +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: WhisperForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`whisper_small_hindi_talium` is a Hindi model originally trained by Talium. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_hindi_talium_hi_5.5.0_3.0_1727119306365.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_hindi_talium_hi_5.5.0_3.0_1727119306365.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +audioAssembler = AudioAssembler() \ + .setInputCol("audio_content") \ + .setOutputCol("audio_assembler") + +speechToText = WhisperForCTC.pretrained("whisper_small_hindi_talium","hi") \ + .setInputCols(["audio_assembler"]) \ + .setOutputCol("text") + +pipeline = Pipeline().setStages([audioAssembler, speechToText]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val audioAssembler = new AudioAssembler() + .setInputCol("audio_content") + .setOutputCol("audio_assembler") + +val speechToText = WhisperForCTC.pretrained("whisper_small_hindi_talium", "hi") + .setInputCols(Array("audio_assembler")) + .setOutputCol("text") + +val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_hindi_talium| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|hi| +|Size:|1.7 GB| + +## References + +https://huggingface.co/Talium/whisper-small-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-xlm_roberta_base_finetuned_panx_german_french_ligerre_en.md b/docs/_posts/ahmedlone127/2024-09-23-xlm_roberta_base_finetuned_panx_german_french_ligerre_en.md new file mode 100644 index 00000000000000..1ede205e533a7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-xlm_roberta_base_finetuned_panx_german_french_ligerre_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_french_ligerre XlmRoBertaForTokenClassification from ligerre +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_french_ligerre +date: 2024-09-23 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_german_french_ligerre` is an English model originally trained by ligerre. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_ligerre_en_5.5.0_3.0_1727132660091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_ligerre_en_5.5.0_3.0_1727132660091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_french_ligerre","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_french_ligerre", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_french_ligerre| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|858.2 MB| + +## References + +https://huggingface.co/ligerre/xlm-roberta-base-finetuned-panx-de-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-23-xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-23-xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline_en.md new file mode 100644 index 00000000000000..3d900b2d58606f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-23-xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline pipeline XlmRoBertaForTokenClassification from ryatora +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline +date: 2024-09-23 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline` is an English model originally trained by ryatora. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline_en_5.5.0_3.0_1727133354181.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline_en_5.5.0_3.0_1727133354181.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_italian_ryatora_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|828.6 MB| + +## References + +https://huggingface.co/ryatora/xlm-roberta-base-finetuned-panx-it + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-albert_news_classification_pipeline_tw.md b/docs/_posts/ahmedlone127/2024-09-24-albert_news_classification_pipeline_tw.md new file mode 100644 index 00000000000000..75804cb190d33e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-albert_news_classification_pipeline_tw.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Twi albert_news_classification_pipeline pipeline BertForSequenceClassification from clhuang +author: John Snow Labs +name: albert_news_classification_pipeline +date: 2024-09-24 +tags: [tw, open_source, pipeline, onnx] +task: Text Classification +language: tw +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`albert_news_classification_pipeline` is a Twi model originally trained by clhuang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_news_classification_pipeline_tw_5.5.0_3.0_1727213609028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_news_classification_pipeline_tw_5.5.0_3.0_1727213609028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("albert_news_classification_pipeline", lang = "tw") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("albert_news_classification_pipeline", lang = "tw") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_news_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|tw| +|Size:|39.8 MB| + +## References + +https://huggingface.co/clhuang/albert-news-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-albert_test_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-albert_test_model_pipeline_en.md new file mode 100644 index 00000000000000..b3279ba150ff9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-albert_test_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English albert_test_model_pipeline pipeline DistilBertForSequenceClassification from KalaiselvanD +author: John Snow Labs +name: albert_test_model_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_test_model_pipeline` is an English model originally trained by KalaiselvanD. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_test_model_pipeline_en_5.5.0_3.0_1727204801407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_test_model_pipeline_en_5.5.0_3.0_1727204801407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("albert_test_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("albert_test_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_test_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/KalaiselvanD/albert_test_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-answer_equivalence_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-answer_equivalence_bert_pipeline_en.md new file mode 100644 index 00000000000000..3b823d0d26126e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-answer_equivalence_bert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English answer_equivalence_bert_pipeline pipeline BertForSequenceClassification from zli12321 +author: John Snow Labs +name: answer_equivalence_bert_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `answer_equivalence_bert_pipeline` is an English model originally trained by zli12321. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/answer_equivalence_bert_pipeline_en_5.5.0_3.0_1727219430880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/answer_equivalence_bert_pipeline_en_5.5.0_3.0_1727219430880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("answer_equivalence_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("answer_equivalence_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|answer_equivalence_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/zli12321/answer_equivalence_bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_german_cased_gnad10_de.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_german_cased_gnad10_de.md new file mode 100644 index 00000000000000..d748732d4bc8db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_german_cased_gnad10_de.md @@ -0,0 +1,94 @@ +--- +layout: model +title: German bert_base_german_cased_gnad10 BertForSequenceClassification from laiking +author: John Snow Labs +name: bert_base_german_cased_gnad10 +date: 2024-09-24 +tags: [de, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: de +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_german_cased_gnad10` is a German model originally trained by laiking. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_gnad10_de_5.5.0_3.0_1727214089427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_gnad10_de_5.5.0_3.0_1727214089427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_german_cased_gnad10","de") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_german_cased_gnad10", "de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_gnad10| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| + +## References + +https://huggingface.co/laiking/bert-base-german-cased-gnad10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_german_cased_gnad10_pipeline_de.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_german_cased_gnad10_pipeline_de.md new file mode 100644 index 00000000000000..f6c50fdf6457d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_german_cased_gnad10_pipeline_de.md @@ -0,0 +1,70 @@ +--- +layout: model +title: German bert_base_german_cased_gnad10_pipeline pipeline BertForSequenceClassification from laiking +author: John Snow Labs +name: bert_base_german_cased_gnad10_pipeline +date: 2024-09-24 +tags: [de, open_source, pipeline, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_german_cased_gnad10_pipeline` is a German model originally trained by laiking. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_gnad10_pipeline_de_5.5.0_3.0_1727214110542.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_gnad10_pipeline_de_5.5.0_3.0_1727214110542.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_german_cased_gnad10_pipeline", lang = "de") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_german_cased_gnad10_pipeline", lang = "de") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_gnad10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|de| +|Size:|409.1 MB| + +## References + +https://huggingface.co/laiking/bert-base-german-cased-gnad10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_paws_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_paws_pipeline_en.md new file mode 100644 index 00000000000000..3c5655ee5f8f2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_paws_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_paws_pipeline pipeline BertForSequenceClassification from harouzie +author: John Snow Labs +name: bert_base_paws_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_paws_pipeline` is an English model originally trained by harouzie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_paws_pipeline_en_5.5.0_3.0_1727218998501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_paws_pipeline_en_5.5.0_3.0_1727218998501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_paws_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_paws_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_paws_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/harouzie/bert-base-paws + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0_en.md new file mode 100644 index 00000000000000..7bef16ed85913a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0 +date: 2024-09-24 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0` is a English model originally trained by danielkty22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0_en_5.5.0_3.0_1727176161506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0_en_5.5.0_3.0_1727176161506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ep_1_0_b_32_lr_8e_07_dp_0_5_swati_100_southern_sotho_false_fh_true_hs_0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-ep-1.0-b-32-lr-8e-07-dp-0.5-ss-100-st-False-fh-True-hs-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline_en.md new file mode 100644 index 00000000000000..87c7de51c3ad06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline` is a English model originally trained by danielkty22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline_en_5.5.0_3.0_1727175948127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline_en_5.5.0_3.0_1727175948127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ep_2_25_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-ep-2.25-b-32-lr-8e-07-dp-0.5-ss-0-st-True-fh-False-hs-0 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en.md new file mode 100644 index 00000000000000..1fcf42cb485518 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666 +date: 2024-09-24 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666` is an English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en_5.5.0_3.0_1727175671783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en_5.5.0_3.0_1727175671783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-1e-05-wd-0.001-dp-0.2-ss-9119-st-False-fh-True-hs-666 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline_en.md new file mode 100644 index 00000000000000..7a0f926ec7269e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline` is a English model originally trained by danielkty22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline_en_5.5.0_3.0_1727175371795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline_en_5.5.0_3.0_1727175371795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_0_0001_wd_0_001_dp_0_4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-0.0001-wd-0.001-dp-0.4 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true_en.md new file mode 100644 index 00000000000000..98a0efa6c25661 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true +date: 2024-09-24 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true` is a English model originally trained by danielkty22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true_en_5.5.0_3.0_1727175503289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true_en_5.5.0_3.0_1727175503289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_4e_06_wd_0_01_dp_0_2_swati_0_southern_sotho_true_fh_true| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-4e-06-wd-0.01-dp-0.2-ss-0-st-True-fh-True \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_finetuned_arxiv_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_finetuned_arxiv_pipeline_en.md new file mode 100644 index 00000000000000..65a7ff1f900c89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_finetuned_arxiv_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_arxiv_pipeline pipeline BertForSequenceClassification from AyoubChLin +author: John Snow Labs +name: bert_finetuned_arxiv_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_arxiv_pipeline` is a English model originally trained by AyoubChLin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_arxiv_pipeline_en_5.5.0_3.0_1727222243474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_arxiv_pipeline_en_5.5.0_3.0_1727222243474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_arxiv_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_arxiv_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_arxiv_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/AyoubChLin/bert-finetuned-Arxiv + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_gemma2b_multivllm_nodropsus_0_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_gemma2b_multivllm_nodropsus_0_en.md new file mode 100644 index 00000000000000..3832278a6699e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_gemma2b_multivllm_nodropsus_0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_gemma2b_multivllm_nodropsus_0 DistilBertForSequenceClassification from jvelja +author: John Snow Labs +name: bert_gemma2b_multivllm_nodropsus_0 +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_gemma2b_multivllm_nodropsus_0` is a English model originally trained by jvelja. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_gemma2b_multivllm_nodropsus_0_en_5.5.0_3.0_1727164257551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_gemma2b_multivllm_nodropsus_0_en_5.5.0_3.0_1727164257551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_gemma2b_multivllm_nodropsus_0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_gemma2b_multivllm_nodropsus_0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_gemma2b_multivllm_nodropsus_0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jvelja/BERT_gemma2b-multivllm-NodropSus_0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-bert_persian_farsi_base_uncased_nlp_course_hw2_en.md b/docs/_posts/ahmedlone127/2024-09-24-bert_persian_farsi_base_uncased_nlp_course_hw2_en.md new file mode 100644 index 00000000000000..8f62591fcd500c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-bert_persian_farsi_base_uncased_nlp_course_hw2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_persian_farsi_base_uncased_nlp_course_hw2 BertEmbeddings from iMahdiGhazavi +author: John Snow Labs +name: bert_persian_farsi_base_uncased_nlp_course_hw2 +date: 2024-09-24 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_persian_farsi_base_uncased_nlp_course_hw2` is a English model originally trained by iMahdiGhazavi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_persian_farsi_base_uncased_nlp_course_hw2_en_5.5.0_3.0_1727161764913.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_persian_farsi_base_uncased_nlp_course_hw2_en_5.5.0_3.0_1727161764913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_persian_farsi_base_uncased_nlp_course_hw2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("bert_persian_farsi_base_uncased_nlp_course_hw2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_persian_farsi_base_uncased_nlp_course_hw2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|605.8 MB| + +## References + +https://huggingface.co/iMahdiGhazavi/bert-fa-base-uncased-nlp-course-hw2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-biom_albert_xxlarge_en.md b/docs/_posts/ahmedlone127/2024-09-24-biom_albert_xxlarge_en.md new file mode 100644 index 00000000000000..7a29cf6a6357cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-biom_albert_xxlarge_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English biom_albert_xxlarge AlbertEmbeddings from sultan +author: John Snow Labs +name: biom_albert_xxlarge +date: 2024-09-24 +tags: [en, open_source, onnx, embeddings, albert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: AlbertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained AlbertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`biom_albert_xxlarge` is a English model originally trained by sultan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biom_albert_xxlarge_en_5.5.0_3.0_1727220219405.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biom_albert_xxlarge_en_5.5.0_3.0_1727220219405.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = AlbertEmbeddings.pretrained("biom_albert_xxlarge","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = AlbertEmbeddings.pretrained("biom_albert_xxlarge","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biom_albert_xxlarge| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[albert]| +|Language:|en| +|Size:|771.2 MB| + +## References + +https://huggingface.co/sultan/BioM-ALBERT-xxlarge \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_model_jfunk14_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_model_jfunk14_pipeline_en.md new file mode 100644 index 00000000000000..c848f23e13aacd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_model_jfunk14_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_model_jfunk14_pipeline pipeline DistilBertForSequenceClassification from jfunk14 +author: John Snow Labs +name: burmese_awesome_model_jfunk14_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_model_jfunk14_pipeline` is a English model originally trained by jfunk14. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_jfunk14_pipeline_en_5.5.0_3.0_1727205045342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_jfunk14_pipeline_en_5.5.0_3.0_1727205045342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_model_jfunk14_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_model_jfunk14_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_jfunk14_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|250.1 MB| + +## References + +https://huggingface.co/jfunk14/my_awesome_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_model_sid_pas_en.md b/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_model_sid_pas_en.md new file mode 100644 index 00000000000000..0ef8fe6529b310 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_model_sid_pas_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English burmese_awesome_model_sid_pas DistilBertForSequenceClassification from Sid-Pas +author: John Snow Labs +name: burmese_awesome_model_sid_pas +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_sid_pas` is an English model originally trained by Sid-Pas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_sid_pas_en_5.5.0_3.0_1727164589528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_sid_pas_en_5.5.0_3.0_1727164589528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_sid_pas","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_sid_pas", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_sid_pas| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Sid-Pas/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_text_classification_jeruan3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_text_classification_jeruan3_pipeline_en.md new file mode 100644 index 00000000000000..2a0b0c6304b830 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-burmese_awesome_text_classification_jeruan3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_text_classification_jeruan3_pipeline pipeline DistilBertForSequenceClassification from jeruan3 +author: John Snow Labs +name: burmese_awesome_text_classification_jeruan3_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_text_classification_jeruan3_pipeline` is an English model originally trained by jeruan3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_text_classification_jeruan3_pipeline_en_5.5.0_3.0_1727154524293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_text_classification_jeruan3_pipeline_en_5.5.0_3.0_1727154524293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_text_classification_jeruan3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_text_classification_jeruan3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_text_classification_jeruan3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/jeruan3/my-awesome-text-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-cl_arabertv0_1_base_33379_arabic_tydiqa_en.md b/docs/_posts/ahmedlone127/2024-09-24-cl_arabertv0_1_base_33379_arabic_tydiqa_en.md new file mode 100644 index 00000000000000..aedaf44fffce52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-cl_arabertv0_1_base_33379_arabic_tydiqa_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English cl_arabertv0_1_base_33379_arabic_tydiqa BertForQuestionAnswering from MatMulMan +author: John Snow Labs +name: cl_arabertv0_1_base_33379_arabic_tydiqa +date: 2024-09-24 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cl_arabertv0_1_base_33379_arabic_tydiqa` is an English model originally trained by MatMulMan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cl_arabertv0_1_base_33379_arabic_tydiqa_en_5.5.0_3.0_1727216828593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cl_arabertv0_1_base_33379_arabic_tydiqa_en_5.5.0_3.0_1727216828593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("cl_arabertv0_1_base_33379_arabic_tydiqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("cl_arabertv0_1_base_33379_arabic_tydiqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cl_arabertv0_1_base_33379_arabic_tydiqa| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|505.0 MB| + +## References + +https://huggingface.co/MatMulMan/CL-AraBERTv0.1-base-33379-arabic_tydiqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-clip_vit_large_patch14_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-clip_vit_large_patch14_pipeline_en.md new file mode 100644 index 00000000000000..6db48d6aacabcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-clip_vit_large_patch14_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English clip_vit_large_patch14_pipeline pipeline CLIPForZeroShotClassification from openai +author: John Snow Labs +name: clip_vit_large_patch14_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CLIPForZeroShotClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clip_vit_large_patch14_pipeline` is an English model originally trained by openai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clip_vit_large_patch14_pipeline_en_5.5.0_3.0_1727208211453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clip_vit_large_patch14_pipeline_en_5.5.0_3.0_1727208211453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("clip_vit_large_patch14_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("clip_vit_large_patch14_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clip_vit_large_patch14_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.1 GB| + +## References + +https://huggingface.co/openai/clip-vit-large-patch14 + +## Included Models + +- ImageAssembler +- CLIPForZeroShotClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-db_mc2_4_1_en.md b/docs/_posts/ahmedlone127/2024-09-24-db_mc2_4_1_en.md new file mode 100644 index 00000000000000..097ff72ff61fb4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-db_mc2_4_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English db_mc2_4_1 DistilBertForSequenceClassification from exala +author: John Snow Labs +name: db_mc2_4_1 +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `db_mc2_4_1` is an English model originally trained by exala. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/db_mc2_4_1_en_5.5.0_3.0_1727137572600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/db_mc2_4_1_en_5.5.0_3.0_1727137572600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("db_mc2_4_1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("db_mc2_4_1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|db_mc2_4_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/exala/db_mc2_4.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-db_mc2_4_4_en.md b/docs/_posts/ahmedlone127/2024-09-24-db_mc2_4_4_en.md new file mode 100644 index 00000000000000..08bf48dd790e6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-db_mc2_4_4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English db_mc2_4_4 DistilBertForSequenceClassification from exala +author: John Snow Labs +name: db_mc2_4_4 +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `db_mc2_4_4` is an English model originally trained by exala. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/db_mc2_4_4_en_5.5.0_3.0_1727204896035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/db_mc2_4_4_en_5.5.0_3.0_1727204896035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("db_mc2_4_4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("db_mc2_4_4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|db_mc2_4_4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/exala/db_mc2_4.4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-disease_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-disease_classifier_pipeline_en.md new file mode 100644 index 00000000000000..cd60a26ad495a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-disease_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English disease_classifier_pipeline pipeline DistilBertForSequenceClassification from Amirth24 +author: John Snow Labs +name: disease_classifier_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `disease_classifier_pipeline` is an English model originally trained by Amirth24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/disease_classifier_pipeline_en_5.5.0_3.0_1727204916319.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/disease_classifier_pipeline_en_5.5.0_3.0_1727204916319.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("disease_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("disease_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|disease_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|252.6 MB| + +## References + +https://huggingface.co/Amirth24/disease_classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_distilled_clinc_omersubasi_en.md b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_distilled_clinc_omersubasi_en.md new file mode 100644 index 00000000000000..a976054acfdcc5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_distilled_clinc_omersubasi_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_omersubasi DistilBertForSequenceClassification from omersubasi +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_omersubasi +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_omersubasi` is an English model originally trained by omersubasi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_omersubasi_en_5.5.0_3.0_1727137156024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_omersubasi_en_5.5.0_3.0_1727137156024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_omersubasi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_omersubasi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_omersubasi| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/omersubasi/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline_en.md new file mode 100644 index 00000000000000..6378e38ef727b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline pipeline DistilBertForSequenceClassification from jlsurdilla +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline` is an English model originally trained by jlsurdilla.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline_en_5.5.0_3.0_1727137178591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline_en_5.5.0_3.0_1727137178591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jlsurdilla_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jlsurdilla/distilbert-base-uncased-finetuned-emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_en.md b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_en.md new file mode 100644 index 00000000000000..25fa7d0e03e9b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned DistilBertForQuestionAnswering from stig +author: John Snow Labs +name: distilbert_base_uncased_finetuned +date: 2024-09-24 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned` is an English model originally trained by stig.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_en_5.5.0_3.0_1727219908523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_en_5.5.0_3.0_1727219908523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +References + +https://huggingface.co/stig/distilbert-base-uncased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_pipeline_en.md new file mode 100644 index 00000000000000..034dc2e87471be --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-distilbert_base_uncased_finetuned_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pipeline pipeline DistilBertForQuestionAnswering from madhavpro3 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pipeline` is an English model originally trained by madhavpro3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pipeline_en_5.5.0_3.0_1727219922010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pipeline_en_5.5.0_3.0_1727219922010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/madhavpro3/distilbert-base-uncased-finetuned + +## Included Models + +- MultiDocumentAssembler +- DistilBertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-distilbert_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-distilbert_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..9843b6b6a145e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-distilbert_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_sentiment_pipeline pipeline DistilBertForSequenceClassification from actaylor +author: John Snow Labs +name: distilbert_sentiment_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sentiment_pipeline` is an English model originally trained by actaylor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_pipeline_en_5.5.0_3.0_1727136969741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_pipeline_en_5.5.0_3.0_1727136969741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/actaylor/distilbert-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-distilbert_uncased_newsqa_en.md b/docs/_posts/ahmedlone127/2024-09-24-distilbert_uncased_newsqa_en.md new file mode 100644 index 00000000000000..b3d7f169a65000 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-distilbert_uncased_newsqa_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English distilbert_uncased_newsqa DistilBertForQuestionAnswering from Prasetyow12 +author: John Snow Labs +name: distilbert_uncased_newsqa +date: 2024-09-24 +tags: [en, open_source, onnx, question_answering, distilbert] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_uncased_newsqa` is an English model originally trained by Prasetyow12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_uncased_newsqa_en_5.5.0_3.0_1727219902839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_uncased_newsqa_en_5.5.0_3.0_1727219902839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_uncased_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_uncased_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_uncased_newsqa| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Prasetyow12/distilbert-uncased-newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-distilroberta_base_finetuned_wikitext2_aekang12_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-distilroberta_base_finetuned_wikitext2_aekang12_pipeline_en.md new file mode 100644 index 00000000000000..40d8205cd63892 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-distilroberta_base_finetuned_wikitext2_aekang12_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilroberta_base_finetuned_wikitext2_aekang12_pipeline pipeline RoBertaEmbeddings from aekang12 +author: John Snow Labs +name: distilroberta_base_finetuned_wikitext2_aekang12_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_finetuned_wikitext2_aekang12_pipeline` is an English model originally trained by aekang12.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_finetuned_wikitext2_aekang12_pipeline_en_5.5.0_3.0_1727168685112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_finetuned_wikitext2_aekang12_pipeline_en_5.5.0_3.0_1727168685112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilroberta_base_finetuned_wikitext2_aekang12_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilroberta_base_finetuned_wikitext2_aekang12_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_finetuned_wikitext2_aekang12_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|306.5 MB| + +## References + +https://huggingface.co/aekang12/distilroberta-base-finetuned-wikitext2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-finetuning_sentiment_model_3000_kaggle_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-finetuning_sentiment_model_3000_kaggle_pipeline_en.md new file mode 100644 index 00000000000000..a37ccab726e036 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-finetuning_sentiment_model_3000_kaggle_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_kaggle_pipeline pipeline DistilBertForSequenceClassification from Munshid123 +author: John Snow Labs +name: finetuning_sentiment_model_3000_kaggle_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_kaggle_pipeline` is an English model originally trained by Munshid123.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_kaggle_pipeline_en_5.5.0_3.0_1727154732516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_kaggle_pipeline_en_5.5.0_3.0_1727154732516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_sentiment_model_3000_kaggle_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_sentiment_model_3000_kaggle_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_kaggle_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Munshid123/finetuning-sentiment-model-3000-kaggle + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-joe_roberta_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-joe_roberta_pipeline_en.md new file mode 100644 index 00000000000000..815936a767bd53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-joe_roberta_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English joe_roberta_pipeline pipeline RoBertaForSequenceClassification from Gikubu +author: John Snow Labs +name: joe_roberta_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `joe_roberta_pipeline` is an English model originally trained by Gikubu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/joe_roberta_pipeline_en_5.5.0_3.0_1727167643996.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/joe_roberta_pipeline_en_5.5.0_3.0_1727167643996.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("joe_roberta_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("joe_roberta_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|joe_roberta_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|444.0 MB| + +## References + +https://huggingface.co/Gikubu/joe_roberta + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-malayalam_qa_model_ml.md b/docs/_posts/ahmedlone127/2024-09-24-malayalam_qa_model_ml.md new file mode 100644 index 00000000000000..b576b5217fdba6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-malayalam_qa_model_ml.md @@ -0,0 +1,86 @@ +--- +layout: model +title: Malayalam malayalam_qa_model BertForQuestionAnswering from Anitha2020 +author: John Snow Labs +name: malayalam_qa_model +date: 2024-09-24 +tags: [ml, open_source, onnx, question_answering, bert] +task: Question Answering +language: ml +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malayalam_qa_model` is a Malayalam model originally trained by Anitha2020. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malayalam_qa_model_ml_5.5.0_3.0_1727163188549.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malayalam_qa_model_ml_5.5.0_3.0_1727163188549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("malayalam_qa_model","ml") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("malayalam_qa_model", "ml") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malayalam_qa_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ml| +|Size:|890.5 MB| + +## References + +https://huggingface.co/Anitha2020/Malayalam_QA_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-memo_bert_wsd_memo_bert_danskbert_last_en.md b/docs/_posts/ahmedlone127/2024-09-24-memo_bert_wsd_memo_bert_danskbert_last_en.md new file mode 100644 index 00000000000000..9f632de7d2e357 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-memo_bert_wsd_memo_bert_danskbert_last_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English memo_bert_wsd_memo_bert_danskbert_last XlmRoBertaForSequenceClassification from yemen2016 +author: John Snow Labs +name: memo_bert_wsd_memo_bert_danskbert_last +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `memo_bert_wsd_memo_bert_danskbert_last` is an English model originally trained by yemen2016.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/memo_bert_wsd_memo_bert_danskbert_last_en_5.5.0_3.0_1727155798955.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/memo_bert_wsd_memo_bert_danskbert_last_en_5.5.0_3.0_1727155798955.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("memo_bert_wsd_memo_bert_danskbert_last","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("memo_bert_wsd_memo_bert_danskbert_last", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|memo_bert_wsd_memo_bert_danskbert_last| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.3 MB| + +## References + +https://huggingface.co/yemen2016/MeMo_BERT-WSD-MeMo-BERT-DanskBERT_last \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-n_distilbert_sst5_padding0model_wyzhw_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-n_distilbert_sst5_padding0model_wyzhw_pipeline_en.md new file mode 100644 index 00000000000000..b78fcacb1f826e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-n_distilbert_sst5_padding0model_wyzhw_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_distilbert_sst5_padding0model_wyzhw_pipeline pipeline DistilBertForSequenceClassification from wyzhw +author: John Snow Labs +name: n_distilbert_sst5_padding0model_wyzhw_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_distilbert_sst5_padding0model_wyzhw_pipeline` is an English model originally trained by wyzhw.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_distilbert_sst5_padding0model_wyzhw_pipeline_en_5.5.0_3.0_1727136946418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_distilbert_sst5_padding0model_wyzhw_pipeline_en_5.5.0_3.0_1727136946418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_distilbert_sst5_padding0model_wyzhw_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_distilbert_sst5_padding0model_wyzhw_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_distilbert_sst5_padding0model_wyzhw_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wyzhw/N_distilbert_sst5_padding0model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-nerubios_roberta_base_bne_training_testing_en.md b/docs/_posts/ahmedlone127/2024-09-24-nerubios_roberta_base_bne_training_testing_en.md new file mode 100644 index 00000000000000..7f458cb5810009 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-nerubios_roberta_base_bne_training_testing_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English nerubios_roberta_base_bne_training_testing RoBertaForTokenClassification from ajtamayoh +author: John Snow Labs +name: nerubios_roberta_base_bne_training_testing +date: 2024-09-24 +tags: [en, open_source, onnx, token_classification, roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerubios_roberta_base_bne_training_testing` is an English model originally trained by ajtamayoh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerubios_roberta_base_bne_training_testing_en_5.5.0_3.0_1727151553287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerubios_roberta_base_bne_training_testing_en_5.5.0_3.0_1727151553287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = RoBertaForTokenClassification.pretrained("nerubios_roberta_base_bne_training_testing","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = RoBertaForTokenClassification.pretrained("nerubios_roberta_base_bne_training_testing", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerubios_roberta_base_bne_training_testing| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|437.6 MB| + +## References + +https://huggingface.co/ajtamayoh/NeRUBioS_RoBERTa_base_bne_Training_Testing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-portuguese_up_xlmr_contextincluded_idiomexcluded_4_best_en.md b/docs/_posts/ahmedlone127/2024-09-24-portuguese_up_xlmr_contextincluded_idiomexcluded_4_best_en.md new file mode 100644 index 00000000000000..1f6c9f0555d014 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-portuguese_up_xlmr_contextincluded_idiomexcluded_4_best_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English portuguese_up_xlmr_contextincluded_idiomexcluded_4_best XlmRoBertaForSequenceClassification from harish +author: John Snow Labs +name: portuguese_up_xlmr_contextincluded_idiomexcluded_4_best +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `portuguese_up_xlmr_contextincluded_idiomexcluded_4_best` is an English model originally trained by harish.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/portuguese_up_xlmr_contextincluded_idiomexcluded_4_best_en_5.5.0_3.0_1727153414769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/portuguese_up_xlmr_contextincluded_idiomexcluded_4_best_en_5.5.0_3.0_1727153414769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("portuguese_up_xlmr_contextincluded_idiomexcluded_4_best","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("portuguese_up_xlmr_contextincluded_idiomexcluded_4_best", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|portuguese_up_xlmr_contextincluded_idiomexcluded_4_best| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|788.2 MB| + +## References + +https://huggingface.co/harish/PT-UP-xlmR-ContextIncluded_IdiomExcluded-4_BEST \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-roberta_base_epoch_24_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-roberta_base_epoch_24_pipeline_en.md new file mode 100644 index 00000000000000..c811e917c84157 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-roberta_base_epoch_24_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_epoch_24_pipeline pipeline RoBertaEmbeddings from yanaiela +author: John Snow Labs +name: roberta_base_epoch_24_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_epoch_24_pipeline` is an English model originally trained by yanaiela. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_epoch_24_pipeline_en_5.5.0_3.0_1727169409285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_epoch_24_pipeline_en_5.5.0_3.0_1727169409285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_base_epoch_24_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_base_epoch_24_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_epoch_24_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|297.3 MB| + +## References + +https://huggingface.co/yanaiela/roberta-base-epoch_24 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline_en.md new file mode 100644 index 00000000000000..4e49e270edba09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline pipeline RoBertaForTokenClassification from LionelLow +author: John Snow Labs +name: roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline` is an English model originally trained by LionelLow.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline_en_5.5.0_3.0_1727150798825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline_en_5.5.0_3.0_1727150798825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_finetuned_ner_finetuned_ner_lionellow_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/LionelLow/roberta-large-finetuned-ner-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-sent_bert_base_nli_stsb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-sent_bert_base_nli_stsb_pipeline_en.md new file mode 100644 index 00000000000000..1c68eb09aead02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-sent_bert_base_nli_stsb_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_base_nli_stsb_pipeline pipeline BertSentenceEmbeddings from binwang +author: John Snow Labs +name: sent_bert_base_nli_stsb_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_base_nli_stsb_pipeline` is an English model originally trained by binwang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_nli_stsb_pipeline_en_5.5.0_3.0_1727202161554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_nli_stsb_pipeline_en_5.5.0_3.0_1727202161554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_base_nli_stsb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_base_nli_stsb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_nli_stsb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.7 MB| + +## References + +https://huggingface.co/binwang/bert-base-nli-stsb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-sent_bert_multilingial_geolocation_prediction_en.md b/docs/_posts/ahmedlone127/2024-09-24-sent_bert_multilingial_geolocation_prediction_en.md new file mode 100644 index 00000000000000..ee0dd7e748a81a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-sent_bert_multilingial_geolocation_prediction_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_multilingial_geolocation_prediction BertSentenceEmbeddings from k4tel +author: John Snow Labs +name: sent_bert_multilingial_geolocation_prediction +date: 2024-09-24 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_multilingial_geolocation_prediction` is an English model originally trained by k4tel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_multilingial_geolocation_prediction_en_5.5.0_3.0_1727157360417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_multilingial_geolocation_prediction_en_5.5.0_3.0_1727157360417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_multilingial_geolocation_prediction","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_multilingial_geolocation_prediction","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_multilingial_geolocation_prediction| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|663.2 MB| + +## References + +https://huggingface.co/k4tel/bert-multilingial-geolocation-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-sent_mi_bert_base_en.md b/docs/_posts/ahmedlone127/2024-09-24-sent_mi_bert_base_en.md new file mode 100644 index 00000000000000..f9cdbca9ede744 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-sent_mi_bert_base_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_mi_bert_base BertSentenceEmbeddings from mavinsao +author: John Snow Labs +name: sent_mi_bert_base +date: 2024-09-24 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_mi_bert_base` is a English model originally trained by mavinsao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_mi_bert_base_en_5.5.0_3.0_1727201981743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_mi_bert_base_en_5.5.0_3.0_1727201981743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_mi_bert_base","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_mi_bert_base","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_mi_bert_base| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/mavinsao/mi-bert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-sent_mi_bert_base_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-sent_mi_bert_base_pipeline_en.md new file mode 100644 index 00000000000000..c76954355c43ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-sent_mi_bert_base_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_mi_bert_base_pipeline pipeline BertSentenceEmbeddings from mavinsao +author: John Snow Labs +name: sent_mi_bert_base_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_mi_bert_base_pipeline` is a English model originally trained by mavinsao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_mi_bert_base_pipeline_en_5.5.0_3.0_1727202003330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_mi_bert_base_pipeline_en_5.5.0_3.0_1727202003330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_mi_bert_base_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_mi_bert_base_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_mi_bert_base_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/mavinsao/mi-bert-base + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-sst2_roberta_large_seed_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-sst2_roberta_large_seed_1_pipeline_en.md new file mode 100644 index 00000000000000..eafa60e5e310ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-sst2_roberta_large_seed_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sst2_roberta_large_seed_1_pipeline pipeline RoBertaForSequenceClassification from utahnlp +author: John Snow Labs +name: sst2_roberta_large_seed_1_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sst2_roberta_large_seed_1_pipeline` is a English model originally trained by utahnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sst2_roberta_large_seed_1_pipeline_en_5.5.0_3.0_1727167946417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sst2_roberta_large_seed_1_pipeline_en_5.5.0_3.0_1727167946417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sst2_roberta_large_seed_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sst2_roberta_large_seed_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sst2_roberta_large_seed_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/utahnlp/sst2_roberta-large_seed-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-sucidal_text_classification_distillbert_en.md b/docs/_posts/ahmedlone127/2024-09-24-sucidal_text_classification_distillbert_en.md new file mode 100644 index 00000000000000..8fd62af918c593 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-sucidal_text_classification_distillbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sucidal_text_classification_distillbert DistilBertForSequenceClassification from pradanaadn +author: John Snow Labs +name: sucidal_text_classification_distillbert +date: 2024-09-24 +tags: [en, open_source, onnx, sequence_classification, distilbert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sucidal_text_classification_distillbert` is an English model originally trained by pradanaadn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sucidal_text_classification_distillbert_en_5.5.0_3.0_1727136820819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sucidal_text_classification_distillbert_en_5.5.0_3.0_1727136820819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sucidal_text_classification_distillbert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sucidal_text_classification_distillbert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sucidal_text_classification_distillbert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/pradanaadn/sucidal-text-classification-distillbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-tmp0xmacdh7_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-tmp0xmacdh7_pipeline_en.md new file mode 100644 index 00000000000000..e6fbd10c23ea7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-tmp0xmacdh7_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tmp0xmacdh7_pipeline pipeline DistilBertForSequenceClassification from NikDiGio +author: John Snow Labs +name: tmp0xmacdh7_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tmp0xmacdh7_pipeline` is a English model originally trained by NikDiGio. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tmp0xmacdh7_pipeline_en_5.5.0_3.0_1727154750495.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tmp0xmacdh7_pipeline_en_5.5.0_3.0_1727154750495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tmp0xmacdh7_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tmp0xmacdh7_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tmp0xmacdh7_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/NikDiGio/tmp0xmacdh7 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-twitter_distilbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-twitter_distilbert_pipeline_en.md new file mode 100644 index 00000000000000..1feea72789a917 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-twitter_distilbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English twitter_distilbert_pipeline pipeline DistilBertForSequenceClassification from alexray +author: John Snow Labs +name: twitter_distilbert_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`twitter_distilbert_pipeline` is a English model originally trained by alexray. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_distilbert_pipeline_en_5.5.0_3.0_1727164772795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_distilbert_pipeline_en_5.5.0_3.0_1727164772795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("twitter_distilbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("twitter_distilbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_distilbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alexray/twitter-distilbert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-twitter_roberta_base_en.md b/docs/_posts/ahmedlone127/2024-09-24-twitter_roberta_base_en.md new file mode 100644 index 00000000000000..3f21f83d7a1143 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-twitter_roberta_base_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English twitter_roberta_base RoBertaEmbeddings from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base +date: 2024-09-24 +tags: [en, open_source, onnx, embeddings, roberta] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`twitter_roberta_base` is a English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_en_5.5.0_3.0_1727216050744.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_en_5.5.0_3.0_1727216050744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = RoBertaEmbeddings.pretrained("twitter_roberta_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = RoBertaEmbeddings.pretrained("twitter_roberta_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[roberta]| +|Language:|en| +|Size:|465.9 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-whisper_small_portuguese_pedropauletti_pipeline_pt.md b/docs/_posts/ahmedlone127/2024-09-24-whisper_small_portuguese_pedropauletti_pipeline_pt.md new file mode 100644 index 00000000000000..62fcfbed4c54d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-whisper_small_portuguese_pedropauletti_pipeline_pt.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Portuguese whisper_small_portuguese_pedropauletti_pipeline pipeline WhisperForCTC from pedropauletti +author: John Snow Labs +name: whisper_small_portuguese_pedropauletti_pipeline +date: 2024-09-24 +tags: [pt, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained WhisperForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`whisper_small_portuguese_pedropauletti_pipeline` is a Portuguese model originally trained by pedropauletti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whisper_small_portuguese_pedropauletti_pipeline_pt_5.5.0_3.0_1727194290133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whisper_small_portuguese_pedropauletti_pipeline_pt_5.5.0_3.0_1727194290133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("whisper_small_portuguese_pedropauletti_pipeline", lang = "pt") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("whisper_small_portuguese_pedropauletti_pipeline", lang = "pt") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|whisper_small_portuguese_pedropauletti_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|pt| +|Size:|1.7 GB| + +## References + +https://huggingface.co/pedropauletti/whisper-small-pt + +## Included Models + +- AudioAssembler +- WhisperForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline_en.md new file mode 100644 index 00000000000000..4d429d5b845b9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline pipeline XlmRoBertaForTokenClassification from amitjain171980 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline` is an English model originally trained by amitjain171980. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline_en_5.5.0_3.0_1727160372519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline_en_5.5.0_3.0_1727160372519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_amitjain171980_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|856.0 MB| + +## References + +https://huggingface.co/amitjain171980/xlm-roberta-base-finetuned-panx-all + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_finetuned_panx_english_khadija267_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_finetuned_panx_english_khadija267_pipeline_en.md new file mode 100644 index 00000000000000..ecb4210e787e8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_finetuned_panx_english_khadija267_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_english_khadija267_pipeline pipeline XlmRoBertaForTokenClassification from khadija267 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_english_khadija267_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_panx_english_khadija267_pipeline` is an English model originally trained by khadija267. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_khadija267_pipeline_en_5.5.0_3.0_1727160987979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_khadija267_pipeline_en_5.5.0_3.0_1727160987979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_english_khadija267_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_english_khadija267_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_english_khadija267_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|826.4 MB| + +## References + +https://huggingface.co/khadija267/xlm-roberta-base-finetuned-panx-en + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_russian_sentiment_liniscrowd_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_russian_sentiment_liniscrowd_pipeline_en.md new file mode 100644 index 00000000000000..c8143372c4c3e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-24-xlm_roberta_base_russian_sentiment_liniscrowd_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_russian_sentiment_liniscrowd_pipeline pipeline XlmRoBertaForSequenceClassification from sismetanin +author: John Snow Labs +name: xlm_roberta_base_russian_sentiment_liniscrowd_pipeline +date: 2024-09-24 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_russian_sentiment_liniscrowd_pipeline` is an English model originally trained by sismetanin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_russian_sentiment_liniscrowd_pipeline_en_5.5.0_3.0_1727152827634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_russian_sentiment_liniscrowd_pipeline_en_5.5.0_3.0_1727152827634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_russian_sentiment_liniscrowd_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_russian_sentiment_liniscrowd_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_russian_sentiment_liniscrowd_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|800.0 MB| + +## References + +https://huggingface.co/sismetanin/xlm_roberta_base-ru-sentiment-liniscrowd + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-0329files_zh.md b/docs/_posts/ahmedlone127/2024-09-25-0329files_zh.md new file mode 100644 index 00000000000000..7d75409943aec3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-0329files_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese 0329files BertForTokenClassification from sothisai1 +author: John Snow Labs +name: 0329files +date: 2024-09-25 +tags: [zh, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `0329files` is a Chinese model originally trained by sothisai1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/0329files_zh_5.5.0_3.0_1727284130770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/0329files_zh_5.5.0_3.0_1727284130770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("0329files","zh") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("0329files", "zh") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|0329files| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|406.1 MB| + +## References + +https://huggingface.co/sothisai1/0329files \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-20230817010018_en.md b/docs/_posts/ahmedlone127/2024-09-25-20230817010018_en.md new file mode 100644 index 00000000000000..85a3891c616d29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-20230817010018_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English 20230817010018 BertForSequenceClassification from Onutoa +author: John Snow Labs +name: 20230817010018 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `20230817010018` is an English model originally trained by Onutoa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/20230817010018_en_5.5.0_3.0_1727290852517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/20230817010018_en_5.5.0_3.0_1727290852517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("20230817010018","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("20230817010018", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|20230817010018| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Onutoa/20230817010018 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-20230817010018_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-20230817010018_pipeline_en.md new file mode 100644 index 00000000000000..459d3165a50e66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-20230817010018_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English 20230817010018_pipeline pipeline BertForSequenceClassification from Onutoa +author: John Snow Labs +name: 20230817010018_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `20230817010018_pipeline` is an English model originally trained by Onutoa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/20230817010018_pipeline_en_5.5.0_3.0_1727290920029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/20230817010018_pipeline_en_5.5.0_3.0_1727290920029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("20230817010018_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("20230817010018_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|20230817010018_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Onutoa/20230817010018 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-2d_oomv2_800_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-2d_oomv2_800_pipeline_en.md new file mode 100644 index 00000000000000..4d2984b0e560b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-2d_oomv2_800_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English 2d_oomv2_800_pipeline pipeline BertForSequenceClassification from abbassix +author: John Snow Labs +name: 2d_oomv2_800_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `2d_oomv2_800_pipeline` is an English model originally trained by abbassix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/2d_oomv2_800_pipeline_en_5.5.0_3.0_1727288396946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/2d_oomv2_800_pipeline_en_5.5.0_3.0_1727288396946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("2d_oomv2_800_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("2d_oomv2_800_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|2d_oomv2_800_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/abbassix/2d_oomv2_800 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-aak_bert_base_cased_cpc_ricardo_talavera_en.md b/docs/_posts/ahmedlone127/2024-09-25-aak_bert_base_cased_cpc_ricardo_talavera_en.md new file mode 100644 index 00000000000000..aaf9973106166d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-aak_bert_base_cased_cpc_ricardo_talavera_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English aak_bert_base_cased_cpc_ricardo_talavera BertForSequenceClassification from nishan007 +author: John Snow Labs +name: aak_bert_base_cased_cpc_ricardo_talavera +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aak_bert_base_cased_cpc_ricardo_talavera` is an English model originally trained by nishan007.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aak_bert_base_cased_cpc_ricardo_talavera_en_5.5.0_3.0_1727286358287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aak_bert_base_cased_cpc_ricardo_talavera_en_5.5.0_3.0_1727286358287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("aak_bert_base_cased_cpc_ricardo_talavera","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("aak_bert_base_cased_cpc_ricardo_talavera", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aak_bert_base_cased_cpc_ricardo_talavera| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/nishan007/aak-bert-base-cased-cpc-ricardo-talavera \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-acronyms_baseline_vert_correct_clinicalbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-acronyms_baseline_vert_correct_clinicalbert_pipeline_en.md new file mode 100644 index 00000000000000..2ed3868fcad9da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-acronyms_baseline_vert_correct_clinicalbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English acronyms_baseline_vert_correct_clinicalbert_pipeline pipeline BertForSequenceClassification from Wiggily +author: John Snow Labs +name: acronyms_baseline_vert_correct_clinicalbert_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `acronyms_baseline_vert_correct_clinicalbert_pipeline` is an English model originally trained by Wiggily.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/acronyms_baseline_vert_correct_clinicalbert_pipeline_en_5.5.0_3.0_1727245413365.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/acronyms_baseline_vert_correct_clinicalbert_pipeline_en_5.5.0_3.0_1727245413365.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("acronyms_baseline_vert_correct_clinicalbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("acronyms_baseline_vert_correct_clinicalbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|acronyms_baseline_vert_correct_clinicalbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.6 MB| + +## References + +https://huggingface.co/Wiggily/acronyms_baseline_vert_correct_clinicalbert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_mrpc_en.md b/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_mrpc_en.md new file mode 100644 index 00000000000000..9b8a27c91ed1b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_mrpc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ad_kd_bert_base_uncased_mrpc BertForSequenceClassification from Brucewsy +author: John Snow Labs +name: ad_kd_bert_base_uncased_mrpc +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ad_kd_bert_base_uncased_mrpc` is an English model originally trained by Brucewsy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ad_kd_bert_base_uncased_mrpc_en_5.5.0_3.0_1727285812732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ad_kd_bert_base_uncased_mrpc_en_5.5.0_3.0_1727285812732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ad_kd_bert_base_uncased_mrpc","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ad_kd_bert_base_uncased_mrpc", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ad_kd_bert_base_uncased_mrpc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Brucewsy/AD-KD_bert_base_uncased_mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_qqp_en.md b/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_qqp_en.md new file mode 100644 index 00000000000000..3cf1540c2ecfb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_qqp_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ad_kd_bert_base_uncased_qqp BertForSequenceClassification from Brucewsy +author: John Snow Labs +name: ad_kd_bert_base_uncased_qqp +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ad_kd_bert_base_uncased_qqp` is an English model originally trained by Brucewsy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ad_kd_bert_base_uncased_qqp_en_5.5.0_3.0_1727291311427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ad_kd_bert_base_uncased_qqp_en_5.5.0_3.0_1727291311427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ad_kd_bert_base_uncased_qqp","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ad_kd_bert_base_uncased_qqp", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ad_kd_bert_base_uncased_qqp| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Brucewsy/AD-KD_bert_base_uncased_qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_qqp_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_qqp_pipeline_en.md new file mode 100644 index 00000000000000..19b144a83a7984 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ad_kd_bert_base_uncased_qqp_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ad_kd_bert_base_uncased_qqp_pipeline pipeline BertForSequenceClassification from Brucewsy +author: John Snow Labs +name: ad_kd_bert_base_uncased_qqp_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ad_kd_bert_base_uncased_qqp_pipeline` is an English model originally trained by Brucewsy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ad_kd_bert_base_uncased_qqp_pipeline_en_5.5.0_3.0_1727291333754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ad_kd_bert_base_uncased_qqp_pipeline_en_5.5.0_3.0_1727291333754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ad_kd_bert_base_uncased_qqp_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ad_kd_bert_base_uncased_qqp_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ad_kd_bert_base_uncased_qqp_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Brucewsy/AD-KD_bert_base_uncased_qqp + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-adapmit_multilabel_bge_f_en.md b/docs/_posts/ahmedlone127/2024-09-25-adapmit_multilabel_bge_f_en.md new file mode 100644 index 00000000000000..3a3fc3d17fd075 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-adapmit_multilabel_bge_f_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English adapmit_multilabel_bge_f BertForSequenceClassification from GIZ +author: John Snow Labs +name: adapmit_multilabel_bge_f +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adapmit_multilabel_bge_f` is an English model originally trained by GIZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adapmit_multilabel_bge_f_en_5.5.0_3.0_1727289580372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adapmit_multilabel_bge_f_en_5.5.0_3.0_1727289580372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("adapmit_multilabel_bge_f","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("adapmit_multilabel_bge_f", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adapmit_multilabel_bge_f| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|396.7 MB| + +## References + +https://huggingface.co/GIZ/ADAPMIT-multilabel-bge_f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-adapmit_multilabel_bge_f_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-adapmit_multilabel_bge_f_pipeline_en.md new file mode 100644 index 00000000000000..8bea00ad127003 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-adapmit_multilabel_bge_f_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English adapmit_multilabel_bge_f_pipeline pipeline BertForSequenceClassification from GIZ +author: John Snow Labs +name: adapmit_multilabel_bge_f_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adapmit_multilabel_bge_f_pipeline` is an English model originally trained by GIZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adapmit_multilabel_bge_f_pipeline_en_5.5.0_3.0_1727289602886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adapmit_multilabel_bge_f_pipeline_en_5.5.0_3.0_1727289602886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("adapmit_multilabel_bge_f_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("adapmit_multilabel_bge_f_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adapmit_multilabel_bge_f_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|396.7 MB| + +## References + +https://huggingface.co/GIZ/ADAPMIT-multilabel-bge_f + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-advance_bert_classification_en.md b/docs/_posts/ahmedlone127/2024-09-25-advance_bert_classification_en.md new file mode 100644 index 00000000000000..db89c5ba81d0fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-advance_bert_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English advance_bert_classification BertForSequenceClassification from Kurkur99 +author: John Snow Labs +name: advance_bert_classification +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `advance_bert_classification` is an English model originally trained by Kurkur99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/advance_bert_classification_en_5.5.0_3.0_1727269930905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/advance_bert_classification_en_5.5.0_3.0_1727269930905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("advance_bert_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("advance_bert_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
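Each annotator in a pipeline like the one above reads columns that an earlier stage must already have produced, so the names passed to `setInputCols` have to match an upstream `setOutputCol`. The plain-Python sketch below illustrates that rule without needing a Spark session. `find_unwired_inputs` and the `(name, inputs, output)` tuple layout are hypothetical, invented for this illustration; they are not part of Spark NLP, and the check compares column names only.

```python
# Hypothetical helper, not a Spark NLP API: checks column wiring by name only.
def find_unwired_inputs(stages, source_cols=("text",)):
    """Return (stage_name, column) pairs for inputs that no earlier stage produces."""
    available = set(source_cols)
    problems = []
    for name, inputs, output in stages:
        for col in inputs:
            if col not in available:
                problems.append((name, col))
        # The stage's output becomes available to every later stage.
        available.add(output)
    return problems


# Column wiring of the three-stage classification pipeline, written as plain data.
stages = [
    ("documentAssembler", ["text"], "document"),
    ("tokenizer", ["document"], "token"),
    ("sequenceClassifier", ["document", "token"], "class"),
]

# Same pipeline with a misspelled input column on the last stage.
miswired = stages[:2] + [("sequenceClassifier", ["documents", "token"], "class")]

print(find_unwired_inputs(stages))
print(find_unwired_inputs(miswired))
```

A name-only check like this would catch a misspelled column before `pipeline.fit` is called, but it says nothing about annotation types, which Spark NLP validates separately.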
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|advance_bert_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.1 MB| + +## References + +https://huggingface.co/Kurkur99/Advance_Bert_Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ag_news_19200_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-25-ag_news_19200_bert_base_uncased_en.md new file mode 100644 index 00000000000000..7f0caa1e93b050 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ag_news_19200_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ag_news_19200_bert_base_uncased BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: ag_news_19200_bert_base_uncased +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_news_19200_bert_base_uncased` is an English model originally trained by Kyle1668. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ag_news_19200_bert_base_uncased_en_5.5.0_3.0_1727299604856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ag_news_19200_bert_base_uncased_en_5.5.0_3.0_1727299604856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ag_news_19200_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ag_news_19200_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
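The archive name in the Download link above appears to follow the pattern `<name>_<lang>_<Spark NLP version>_<Spark version>_<timestamp>.zip`, which matches the `edition`, `spark_version` and `language` fields in the front matter. The sketch below splits such a name into its parts. The pattern is inferred only from the URLs in these model cards, not from a documented contract, and `parse_archive_name` and the 13-digit millisecond timestamp reading are assumptions made for this illustration.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the download URLs in these cards (an assumption, not a spec):
# <name>_<lang>_<spark-nlp version>_<spark version>_<epoch milliseconds>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_archive_name(filename):
    """Split a model archive file name into its apparent components."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename!r}")
    # The trailing number is read as milliseconds since the Unix epoch (assumed).
    uploaded = datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc)
    return {
        "name": match["name"],
        "lang": match["lang"],
        "spark_nlp": match["nlp"],
        "spark": match["spark"],
        "uploaded": uploaded,
    }


info = parse_archive_name(
    "ag_news_19200_bert_base_uncased_en_5.5.0_3.0_1727299604856.zip"
)
print(info["name"], info["lang"], info["spark_nlp"], info["spark"])
print(info["uploaded"].date().isoformat())
```

Read this way, the timestamp falls on the same day as the `date` field in the card's front matter, which is some evidence for the millisecond interpretation but not proof of it.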
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ag_news_19200_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/ag-news-19200-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ag_news_19200_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-ag_news_19200_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..55ac060d1237cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ag_news_19200_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ag_news_19200_bert_base_uncased_pipeline pipeline BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: ag_news_19200_bert_base_uncased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_news_19200_bert_base_uncased_pipeline` is an English model originally trained by Kyle1668. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ag_news_19200_bert_base_uncased_pipeline_en_5.5.0_3.0_1727299627759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ag_news_19200_bert_base_uncased_pipeline_en_5.5.0_3.0_1727299627759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
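+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# df is assumed to be a Spark DataFrame with a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipeline = PretrainedPipeline("ag_news_19200_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +// df is assumed to be a Spark DataFrame with a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val pipeline = new PretrainedPipeline("ag_news_19200_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +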
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ag_news_19200_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/ag-news-19200-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ag_news_9600_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-25-ag_news_9600_bert_base_uncased_en.md new file mode 100644 index 00000000000000..922cf373d049aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ag_news_9600_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ag_news_9600_bert_base_uncased BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: ag_news_9600_bert_base_uncased +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_news_9600_bert_base_uncased` is an English model originally trained by Kyle1668. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ag_news_9600_bert_base_uncased_en_5.5.0_3.0_1727300086729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ag_news_9600_bert_base_uncased_en_5.5.0_3.0_1727300086729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ag_news_9600_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ag_news_9600_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ag_news_9600_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/ag-news-9600-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ag_news_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-ag_news_classification_pipeline_en.md new file mode 100644 index 00000000000000..c41855c214c5a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ag_news_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ag_news_classification_pipeline pipeline BertForSequenceClassification from shed-e +author: John Snow Labs +name: ag_news_classification_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_news_classification_pipeline` is an English model originally trained by shed-e. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ag_news_classification_pipeline_en_5.5.0_3.0_1727279460022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ag_news_classification_pipeline_en_5.5.0_3.0_1727279460022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
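+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# df is assumed to be a Spark DataFrame with a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipeline = PretrainedPipeline("ag_news_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +// df is assumed to be a Spark DataFrame with a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val pipeline = new PretrainedPipeline("ag_news_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +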
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ag_news_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/shed-e/ag_news-Classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_qqp_fhtm_5x_weak_en.md b/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_qqp_fhtm_5x_weak_en.md new file mode 100644 index 00000000000000..c5f2eded59055e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_qqp_fhtm_5x_weak_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_base_chinese_finetuned_qqp_fhtm_5x_weak BertForSequenceClassification from r10521708 +author: John Snow Labs +name: albert_base_chinese_finetuned_qqp_fhtm_5x_weak +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_base_chinese_finetuned_qqp_fhtm_5x_weak` is an English model originally trained by r10521708. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_base_chinese_finetuned_qqp_fhtm_5x_weak_en_5.5.0_3.0_1727301436927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_base_chinese_finetuned_qqp_fhtm_5x_weak_en_5.5.0_3.0_1727301436927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_chinese_finetuned_qqp_fhtm_5x_weak","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_chinese_finetuned_qqp_fhtm_5x_weak", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_base_chinese_finetuned_qqp_fhtm_5x_weak| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|39.8 MB| + +## References + +https://huggingface.co/r10521708/albert-base-chinese-finetuned-qqp-FHTM-5x-weak \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline_en.md new file mode 100644 index 00000000000000..980705187fc172 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline pipeline BertForSequenceClassification from r10521708 +author: John Snow Labs +name: albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline` is an English model originally trained by r10521708. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline_en_5.5.0_3.0_1727301439790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline_en_5.5.0_3.0_1727301439790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
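+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# df is assumed to be a Spark DataFrame with a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipeline = PretrainedPipeline("albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +// df is assumed to be a Spark DataFrame with a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val pipeline = new PretrainedPipeline("albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +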
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_base_chinese_finetuned_qqp_fhtm_5x_weak_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|39.8 MB| + +## References + +https://huggingface.co/r10521708/albert-base-chinese-finetuned-qqp-FHTM-5x-weak + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_test_qqp_en.md b/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_test_qqp_en.md new file mode 100644 index 00000000000000..3f5edbc0cf2041 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-albert_base_chinese_finetuned_test_qqp_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_base_chinese_finetuned_test_qqp BertForSequenceClassification from r10521708 +author: John Snow Labs +name: albert_base_chinese_finetuned_test_qqp +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_base_chinese_finetuned_test_qqp` is an English model originally trained by r10521708. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_base_chinese_finetuned_test_qqp_en_5.5.0_3.0_1727305260524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_base_chinese_finetuned_test_qqp_en_5.5.0_3.0_1727305260524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_chinese_finetuned_test_qqp","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_chinese_finetuned_test_qqp", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_base_chinese_finetuned_test_qqp| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|39.8 MB| + +## References + +https://huggingface.co/r10521708/albert-base-chinese-finetuned-test-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-albert_chinese_small_sentiment_en.md b/docs/_posts/ahmedlone127/2024-09-25-albert_chinese_small_sentiment_en.md new file mode 100644 index 00000000000000..6c03013610f78e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-albert_chinese_small_sentiment_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_chinese_small_sentiment BertForSequenceClassification from voidful +author: John Snow Labs +name: albert_chinese_small_sentiment +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_chinese_small_sentiment` is an English model originally trained by voidful. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_chinese_small_sentiment_en_5.5.0_3.0_1727295462689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_chinese_small_sentiment_en_5.5.0_3.0_1727295462689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_chinese_small_sentiment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_chinese_small_sentiment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_chinese_small_sentiment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|18.0 MB| + +## References + +https://huggingface.co/voidful/albert_chinese_small_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-albert_chinese_small_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-albert_chinese_small_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..3811e16af6eac9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-albert_chinese_small_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English albert_chinese_small_sentiment_pipeline pipeline BertForSequenceClassification from voidful +author: John Snow Labs +name: albert_chinese_small_sentiment_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_chinese_small_sentiment_pipeline` is an English model originally trained by voidful. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_chinese_small_sentiment_pipeline_en_5.5.0_3.0_1727295464029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_chinese_small_sentiment_pipeline_en_5.5.0_3.0_1727295464029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
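+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# df is assumed to be a Spark DataFrame with a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipeline = PretrainedPipeline("albert_chinese_small_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +// df is assumed to be a Spark DataFrame with a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val pipeline = new PretrainedPipeline("albert_chinese_small_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +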
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_chinese_small_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|18.0 MB| + +## References + +https://huggingface.co/voidful/albert_chinese_small_sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-alberti128b128l_en.md b/docs/_posts/ahmedlone127/2024-09-25-alberti128b128l_en.md new file mode 100644 index 00000000000000..c39a6b9e4434f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-alberti128b128l_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English alberti128b128l BertForSequenceClassification from IvashinMaxim +author: John Snow Labs +name: alberti128b128l +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `alberti128b128l` is an English model originally trained by IvashinMaxim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/alberti128b128l_en_5.5.0_3.0_1727297577508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/alberti128b128l_en_5.5.0_3.0_1727297577508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("alberti128b128l","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("alberti128b128l", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|alberti128b128l| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|666.6 MB| + +## References + +https://huggingface.co/IvashinMaxim/alberti128b128l \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-arabert_emotions_classification_en.md b/docs/_posts/ahmedlone127/2024-09-25-arabert_emotions_classification_en.md new file mode 100644 index 00000000000000..6ec5e3ea7078f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-arabert_emotions_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English arabert_emotions_classification BertForSequenceClassification from Yousefmd +author: John Snow Labs +name: arabert_emotions_classification +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_emotions_classification` is an English model originally trained by Yousefmd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_emotions_classification_en_5.5.0_3.0_1727291075245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_emotions_classification_en_5.5.0_3.0_1727291075245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("arabert_emotions_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("arabert_emotions_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_emotions_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.4 GB| + +## References + +https://huggingface.co/Yousefmd/arabert-emotions-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-arabert_emotions_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-arabert_emotions_classification_pipeline_en.md new file mode 100644 index 00000000000000..b85a99afdde2e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-arabert_emotions_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English arabert_emotions_classification_pipeline pipeline BertForSequenceClassification from Yousefmd +author: John Snow Labs +name: arabert_emotions_classification_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_emotions_classification_pipeline` is an English model originally trained by Yousefmd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_emotions_classification_pipeline_en_5.5.0_3.0_1727291146641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_emotions_classification_pipeline_en_5.5.0_3.0_1727291146641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("arabert_emotions_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("arabert_emotions_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_emotions_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.4 GB| + +## References + +https://huggingface.co/Yousefmd/arabert-emotions-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-arabert_en.md b/docs/_posts/ahmedlone127/2024-09-25-arabert_en.md new file mode 100644 index 00000000000000..414556448d7fd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-arabert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English arabert BertForSequenceClassification from jaimin +author: John Snow Labs +name: arabert +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert` is an English model originally trained by jaimin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_en_5.5.0_3.0_1727297144995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_en_5.5.0_3.0_1727297144995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("arabert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("arabert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.3 MB| + +## References + +https://huggingface.co/jaimin/AraBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-arabert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-arabert_pipeline_en.md new file mode 100644 index 00000000000000..b802afd484da05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-arabert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English arabert_pipeline pipeline BertForSequenceClassification from jaimin +author: John Snow Labs +name: arabert_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_pipeline` is an English model originally trained by jaimin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_pipeline_en_5.5.0_3.0_1727297172662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_pipeline_en_5.5.0_3.0_1727297172662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("arabert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("arabert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|507.3 MB| + +## References + +https://huggingface.co/jaimin/AraBERT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-autonlp_vaccinfaq_22144706_en.md b/docs/_posts/ahmedlone127/2024-09-25-autonlp_vaccinfaq_22144706_en.md new file mode 100644 index 00000000000000..4dc79da8b08d5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-autonlp_vaccinfaq_22144706_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autonlp_vaccinfaq_22144706 BertForSequenceClassification from Jeska +author: John Snow Labs +name: autonlp_vaccinfaq_22144706 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_vaccinfaq_22144706` is an English model originally trained by Jeska. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_vaccinfaq_22144706_en_5.5.0_3.0_1727304276113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_vaccinfaq_22144706_en_5.5.0_3.0_1727304276113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("autonlp_vaccinfaq_22144706","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autonlp_vaccinfaq_22144706", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_vaccinfaq_22144706| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Jeska/autonlp-vaccinfaq-22144706 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-autonlp_vaccinfaq_22144706_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-autonlp_vaccinfaq_22144706_pipeline_en.md new file mode 100644 index 00000000000000..6eb1d8ad6708cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-autonlp_vaccinfaq_22144706_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English autonlp_vaccinfaq_22144706_pipeline pipeline BertForSequenceClassification from Jeska +author: John Snow Labs +name: autonlp_vaccinfaq_22144706_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_vaccinfaq_22144706_pipeline` is an English model originally trained by Jeska. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_vaccinfaq_22144706_pipeline_en_5.5.0_3.0_1727304298178.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_vaccinfaq_22144706_pipeline_en_5.5.0_3.0_1727304298178.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("autonlp_vaccinfaq_22144706_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("autonlp_vaccinfaq_22144706_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_vaccinfaq_22144706_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Jeska/autonlp-vaccinfaq-22144706 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-autotrain_bertbase_imdb_1275748792_en.md b/docs/_posts/ahmedlone127/2024-09-25-autotrain_bertbase_imdb_1275748792_en.md new file mode 100644 index 00000000000000..8f2a66531cec33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-autotrain_bertbase_imdb_1275748792_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autotrain_bertbase_imdb_1275748792 BertForSequenceClassification from sasha +author: John Snow Labs +name: autotrain_bertbase_imdb_1275748792 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_bertbase_imdb_1275748792` is an English model originally trained by sasha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_bertbase_imdb_1275748792_en_5.5.0_3.0_1727277463938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_bertbase_imdb_1275748792_en_5.5.0_3.0_1727277463938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_bertbase_imdb_1275748792","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_bertbase_imdb_1275748792", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_bertbase_imdb_1275748792| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sasha/autotrain-BERTBase-imdb-1275748792 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-autotrain_chuvash_resume_56492130967_en.md b/docs/_posts/ahmedlone127/2024-09-25-autotrain_chuvash_resume_56492130967_en.md new file mode 100644 index 00000000000000..403cfcef7c5691 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-autotrain_chuvash_resume_56492130967_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autotrain_chuvash_resume_56492130967 BertForSequenceClassification from guriko +author: John Snow Labs +name: autotrain_chuvash_resume_56492130967 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_chuvash_resume_56492130967` is an English model originally trained by guriko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_chuvash_resume_56492130967_en_5.5.0_3.0_1727288128765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_chuvash_resume_56492130967_en_5.5.0_3.0_1727288128765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_chuvash_resume_56492130967","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_chuvash_resume_56492130967", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_chuvash_resume_56492130967| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/guriko/autotrain-cv_resume-56492130967 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-baseline_bert_50k_steps_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-baseline_bert_50k_steps_pipeline_en.md new file mode 100644 index 00000000000000..fa2fda4960bccf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-baseline_bert_50k_steps_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English baseline_bert_50k_steps_pipeline pipeline BertForSequenceClassification from jordyvl +author: John Snow Labs +name: baseline_bert_50k_steps_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `baseline_bert_50k_steps_pipeline` is an English model originally trained by jordyvl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/baseline_bert_50k_steps_pipeline_en_5.5.0_3.0_1727304576476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/baseline_bert_50k_steps_pipeline_en_5.5.0_3.0_1727304576476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("baseline_bert_50k_steps_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("baseline_bert_50k_steps_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|baseline_bert_50k_steps_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/jordyvl/baseline_BERT_50K_steps + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bcms_bertic_parlasent_bcs_bislama_hr.md b/docs/_posts/ahmedlone127/2024-09-25-bcms_bertic_parlasent_bcs_bislama_hr.md new file mode 100644 index 00000000000000..11f85cc4aeac33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bcms_bertic_parlasent_bcs_bislama_hr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Croatian bcms_bertic_parlasent_bcs_bislama BertForSequenceClassification from classla +author: John Snow Labs +name: bcms_bertic_parlasent_bcs_bislama +date: 2024-09-25 +tags: [hr, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: hr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bcms_bertic_parlasent_bcs_bislama` is a Croatian model originally trained by classla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bcms_bertic_parlasent_bcs_bislama_hr_5.5.0_3.0_1727301525886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bcms_bertic_parlasent_bcs_bislama_hr_5.5.0_3.0_1727301525886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bcms_bertic_parlasent_bcs_bislama","hr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bcms_bertic_parlasent_bcs_bislama", "hr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bcms_bertic_parlasent_bcs_bislama| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|hr| +|Size:|414.9 MB| + +## References + +https://huggingface.co/classla/bcms-bertic-parlasent-bcs-bi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bengali_topic_all_doc_bn.md b/docs/_posts/ahmedlone127/2024-09-25-bengali_topic_all_doc_bn.md new file mode 100644 index 00000000000000..c5a1a0305774b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bengali_topic_all_doc_bn.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Bengali bengali_topic_all_doc BertForSequenceClassification from l3cube-pune +author: John Snow Labs +name: bengali_topic_all_doc +date: 2024-09-25 +tags: [bn, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: bn +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bengali_topic_all_doc` is a Bengali model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bengali_topic_all_doc_bn_5.5.0_3.0_1727290564513.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bengali_topic_all_doc_bn_5.5.0_3.0_1727290564513.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bengali_topic_all_doc","bn") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bengali_topic_all_doc", "bn") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bengali_topic_all_doc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|bn| +|Size:|892.8 MB| + +## References + +https://huggingface.co/l3cube-pune/bengali-topic-all-doc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabertv02_tydi_tafseer_pairs_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabertv02_tydi_tafseer_pairs_en.md new file mode 100644 index 00000000000000..0fcd66cf121af6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabertv02_tydi_tafseer_pairs_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_arabertv02_tydi_tafseer_pairs BertForSequenceClassification from MatMulMan +author: John Snow Labs +name: bert_base_arabertv02_tydi_tafseer_pairs +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabertv02_tydi_tafseer_pairs` is an English model originally trained by MatMulMan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabertv02_tydi_tafseer_pairs_en_5.5.0_3.0_1727297371015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabertv02_tydi_tafseer_pairs_en_5.5.0_3.0_1727297371015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabertv02_tydi_tafseer_pairs","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabertv02_tydi_tafseer_pairs", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arabertv02_tydi_tafseer_pairs| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.3 MB| + +## References + +https://huggingface.co/MatMulMan/bert-base-arabertv02-tydi-tafseer-pairs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabertv02_tydi_tafseer_pairs_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabertv02_tydi_tafseer_pairs_pipeline_en.md new file mode 100644 index 00000000000000..1177b775d4b5e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabertv02_tydi_tafseer_pairs_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_arabertv02_tydi_tafseer_pairs_pipeline pipeline BertForSequenceClassification from MatMulMan +author: John Snow Labs +name: bert_base_arabertv02_tydi_tafseer_pairs_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabertv02_tydi_tafseer_pairs_pipeline` is an English model originally trained by MatMulMan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabertv02_tydi_tafseer_pairs_pipeline_en_5.5.0_3.0_1727297398783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabertv02_tydi_tafseer_pairs_pipeline_en_5.5.0_3.0_1727297398783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_arabertv02_tydi_tafseer_pairs_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_arabertv02_tydi_tafseer_pairs_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arabertv02_tydi_tafseer_pairs_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|507.3 MB| + +## References + +https://huggingface.co/MatMulMan/bert-base-arabertv02-tydi-tafseer-pairs + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabic_emotion_analysis_v3_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabic_emotion_analysis_v3_en.md new file mode 100644 index 00000000000000..5463d7d56e083b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_arabic_emotion_analysis_v3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_arabic_emotion_analysis_v3 BertForSequenceClassification from alpcansoydas +author: John Snow Labs +name: bert_base_arabic_emotion_analysis_v3 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabic_emotion_analysis_v3` is an English model originally trained by alpcansoydas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabic_emotion_analysis_v3_en_5.5.0_3.0_1727291326769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabic_emotion_analysis_v3_en_5.5.0_3.0_1727291326769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabic_emotion_analysis_v3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabic_emotion_analysis_v3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arabic_emotion_analysis_v3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.2 MB| + +## References + +https://huggingface.co/alpcansoydas/bert-base-arabic-emotion-analysis-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_b2b_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_b2b_en.md new file mode 100644 index 00000000000000..4d06875aef47c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_b2b_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_b2b BertForSequenceClassification from Egel +author: John Snow Labs +name: bert_base_b2b +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_b2b` is an English model originally trained by Egel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_b2b_en_5.5.0_3.0_1727272765484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_b2b_en_5.5.0_3.0_1727272765484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_b2b","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_b2b", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_b2b| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.7 MB| + +## References + +https://huggingface.co/Egel/bert-base-b2b \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_b2b_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_b2b_pipeline_en.md new file mode 100644 index 00000000000000..f65923067f6c61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_b2b_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_b2b_pipeline pipeline BertForSequenceClassification from Egel +author: John Snow Labs +name: bert_base_b2b_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_b2b_pipeline` is an English model originally trained by Egel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_b2b_pipeline_en_5.5.0_3.0_1727272801300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_b2b_pipeline_en_5.5.0_3.0_1727272801300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_b2b_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_b2b_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_b2b_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|667.7 MB| + +## References + +https://huggingface.co/Egel/bert-base-b2b + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_a_grishman_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_a_grishman_en.md new file mode 100644 index 00000000000000..a4072063d844d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_a_grishman_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_pt2_a_grishman BertForSequenceClassification from a-grishman +author: John Snow Labs +name: bert_base_banking77_pt2_a_grishman +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_a_grishman` is an English model originally trained by a-grishman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_a_grishman_en_5.5.0_3.0_1727292218511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_a_grishman_en_5.5.0_3.0_1727292218511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_a_grishman","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_a_grishman", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_a_grishman| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/a-grishman/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_a_grishman_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_a_grishman_pipeline_en.md new file mode 100644 index 00000000000000..dcaf8cfd64d84d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_a_grishman_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_a_grishman_pipeline pipeline BertForSequenceClassification from a-grishman +author: John Snow Labs +name: bert_base_banking77_pt2_a_grishman_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_a_grishman_pipeline` is an English model originally trained by a-grishman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_a_grishman_pipeline_en_5.5.0_3.0_1727292240209.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_a_grishman_pipeline_en_5.5.0_3.0_1727292240209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_a_grishman_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_a_grishman_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_a_grishman_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/a-grishman/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_andriydovgal_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_andriydovgal_pipeline_en.md new file mode 100644 index 00000000000000..46e0c1aa5dc7d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_andriydovgal_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_andriydovgal_pipeline pipeline BertForSequenceClassification from andriydovgal +author: John Snow Labs +name: bert_base_banking77_pt2_andriydovgal_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_andriydovgal_pipeline` is an English model originally trained by andriydovgal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_andriydovgal_pipeline_en_5.5.0_3.0_1727267289273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_andriydovgal_pipeline_en_5.5.0_3.0_1727267289273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_andriydovgal_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_andriydovgal_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_andriydovgal_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/andriydovgal/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_davinnnnn_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_davinnnnn_en.md new file mode 100644 index 00000000000000..77be3ac1591266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_davinnnnn_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_pt2_davinnnnn BertForSequenceClassification from davinnnnn +author: John Snow Labs +name: bert_base_banking77_pt2_davinnnnn +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_davinnnnn` is an English model originally trained by davinnnnn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_davinnnnn_en_5.5.0_3.0_1727288231537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_davinnnnn_en_5.5.0_3.0_1727288231537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_davinnnnn","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_davinnnnn", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_davinnnnn| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/davinnnnn/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_davinnnnn_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_davinnnnn_pipeline_en.md new file mode 100644 index 00000000000000..83c384d462ab62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_davinnnnn_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_davinnnnn_pipeline pipeline BertForSequenceClassification from davinnnnn +author: John Snow Labs +name: bert_base_banking77_pt2_davinnnnn_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_davinnnnn_pipeline` is an English model originally trained by davinnnnn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_davinnnnn_pipeline_en_5.5.0_3.0_1727288252110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_davinnnnn_pipeline_en_5.5.0_3.0_1727288252110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_davinnnnn_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_davinnnnn_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_davinnnnn_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/davinnnnn/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_gchauhan_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_gchauhan_en.md new file mode 100644 index 00000000000000..a3634bef3d71db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_gchauhan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_pt2_gchauhan BertForSequenceClassification from gchauhan +author: John Snow Labs +name: bert_base_banking77_pt2_gchauhan +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_gchauhan` is an English model originally trained by gchauhan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_gchauhan_en_5.5.0_3.0_1727306908353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_gchauhan_en_5.5.0_3.0_1727306908353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_gchauhan","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_gchauhan", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_gchauhan| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/gchauhan/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_gchauhan_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_gchauhan_pipeline_en.md new file mode 100644 index 00000000000000..62661003d66786 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_gchauhan_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_gchauhan_pipeline pipeline BertForSequenceClassification from gchauhan +author: John Snow Labs +name: bert_base_banking77_pt2_gchauhan_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_gchauhan_pipeline` is an English model originally trained by gchauhan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_gchauhan_pipeline_en_5.5.0_3.0_1727306929787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_gchauhan_pipeline_en_5.5.0_3.0_1727306929787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_gchauhan_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_gchauhan_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_gchauhan_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/gchauhan/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_lauritssn_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_lauritssn_pipeline_en.md new file mode 100644 index 00000000000000..ea429660b66e51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_lauritssn_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_lauritssn_pipeline pipeline BertForSequenceClassification from lauritssn +author: John Snow Labs +name: bert_base_banking77_pt2_lauritssn_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_lauritssn_pipeline` is an English model originally trained by lauritssn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_lauritssn_pipeline_en_5.5.0_3.0_1727308741403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_lauritssn_pipeline_en_5.5.0_3.0_1727308741403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_lauritssn_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_lauritssn_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_lauritssn_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/lauritssn/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tingli_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tingli_en.md new file mode 100644 index 00000000000000..e8d5225b122b0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tingli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_pt2_tingli BertForSequenceClassification from Tingli +author: John Snow Labs +name: bert_base_banking77_pt2_tingli +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_tingli` is an English model originally trained by Tingli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tingli_en_5.5.0_3.0_1727289444596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tingli_en_5.5.0_3.0_1727289444596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_tingli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_tingli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_tingli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Tingli/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tingli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tingli_pipeline_en.md new file mode 100644 index 00000000000000..834deff63dcb37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tingli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_tingli_pipeline pipeline BertForSequenceClassification from Tingli +author: John Snow Labs +name: bert_base_banking77_pt2_tingli_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_tingli_pipeline` is an English model originally trained by Tingli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tingli_pipeline_en_5.5.0_3.0_1727289466196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tingli_pipeline_en_5.5.0_3.0_1727289466196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_tingli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_tingli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_tingli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Tingli/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tunggad_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tunggad_en.md new file mode 100644 index 00000000000000..67895d9e0173a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tunggad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_pt2_tunggad BertForSequenceClassification from tunggad +author: John Snow Labs +name: bert_base_banking77_pt2_tunggad +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_tunggad` is an English model originally trained by tunggad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tunggad_en_5.5.0_3.0_1727308556866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tunggad_en_5.5.0_3.0_1727308556866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_tunggad","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_tunggad", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_tunggad| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/tunggad/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tunggad_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tunggad_pipeline_en.md new file mode 100644 index 00000000000000..ba391612a81868 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_tunggad_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_tunggad_pipeline pipeline BertForSequenceClassification from tunggad +author: John Snow Labs +name: bert_base_banking77_pt2_tunggad_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_tunggad_pipeline` is an English model originally trained by tunggad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tunggad_pipeline_en_5.5.0_3.0_1727308578440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_tunggad_pipeline_en_5.5.0_3.0_1727308578440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_tunggad_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_tunggad_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_tunggad_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/tunggad/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_vildgras_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_vildgras_pipeline_en.md new file mode 100644 index 00000000000000..7ac5b85b1bb3c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_banking77_pt2_vildgras_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_vildgras_pipeline pipeline BertForSequenceClassification from vildgras +author: John Snow Labs +name: bert_base_banking77_pt2_vildgras_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_vildgras_pipeline` is an English model originally trained by vildgras.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_vildgras_pipeline_en_5.5.0_3.0_1727268064932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_vildgras_pipeline_en_5.5.0_3.0_1727268064932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_vildgras_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_vildgras_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_vildgras_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/vildgras/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_canadawildfire_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_canadawildfire_en.md new file mode 100644 index 00000000000000..447d9450f2c69e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_canadawildfire_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_canadawildfire BertForSequenceClassification from rizvi-rahil786 +author: John Snow Labs +name: bert_base_canadawildfire +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_canadawildfire` is an English model originally trained by rizvi-rahil786. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_canadawildfire_en_5.5.0_3.0_1727289174485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_canadawildfire_en_5.5.0_3.0_1727289174485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_canadawildfire","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_canadawildfire", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_canadawildfire| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/rizvi-rahil786/bert-base-canadaWildfire \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_canadawildfire_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_canadawildfire_pipeline_en.md new file mode 100644 index 00000000000000..7a9f8a0cf2a0a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_canadawildfire_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_canadawildfire_pipeline pipeline BertForSequenceClassification from rizvi-rahil786 +author: John Snow Labs +name: bert_base_canadawildfire_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_canadawildfire_pipeline` is an English model originally trained by rizvi-rahil786. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_canadawildfire_pipeline_en_5.5.0_3.0_1727289196252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_canadawildfire_pipeline_en_5.5.0_3.0_1727289196252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_canadawildfire_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_canadawildfire_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_canadawildfire_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/rizvi-rahil786/bert-base-canadaWildfire + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_english_sentweet_derogatory_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_english_sentweet_derogatory_en.md new file mode 100644 index 00000000000000..1060b781887f31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_english_sentweet_derogatory_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_english_sentweet_derogatory BertForSequenceClassification from jayanta +author: John Snow Labs +name: bert_base_cased_english_sentweet_derogatory +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_english_sentweet_derogatory` is an English model originally trained by jayanta.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_derogatory_en_5.5.0_3.0_1727288770427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_derogatory_en_5.5.0_3.0_1727288770427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_english_sentweet_derogatory","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_english_sentweet_derogatory", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_english_sentweet_derogatory| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/jayanta/bert-base-cased-english-sentweet-Derogatory \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_finetuned_ner_bc2gm_iob_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_finetuned_ner_bc2gm_iob_pipeline_en.md new file mode 100644 index 00000000000000..0ce81bd70dd56e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_finetuned_ner_bc2gm_iob_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_finetuned_ner_bc2gm_iob_pipeline pipeline BertForTokenClassification from DunnBC22 +author: John Snow Labs +name: bert_base_cased_finetuned_ner_bc2gm_iob_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_ner_bc2gm_iob_pipeline` is an English model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_ner_bc2gm_iob_pipeline_en_5.5.0_3.0_1727284132847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_ner_bc2gm_iob_pipeline_en_5.5.0_3.0_1727284132847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_finetuned_ner_bc2gm_iob_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_finetuned_ner_bc2gm_iob_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_ner_bc2gm_iob_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/DunnBC22/bert-base-cased-finetuned-ner-BC2GM-IOB + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_finetuned_sst2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_finetuned_sst2_pipeline_en.md new file mode 100644 index 00000000000000..abbb4ebc10a50d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_finetuned_sst2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_finetuned_sst2_pipeline pipeline BertForSequenceClassification from w05230505 +author: John Snow Labs +name: bert_base_cased_finetuned_sst2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_sst2_pipeline` is an English model originally trained by w05230505.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_sst2_pipeline_en_5.5.0_3.0_1727291239831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_sst2_pipeline_en_5.5.0_3.0_1727291239831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_finetuned_sst2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_finetuned_sst2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_sst2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/w05230505/bert-base-cased-finetuned-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_ft5_3ep_s42_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_ft5_3ep_s42_en.md new file mode 100644 index 00000000000000..cc5664c04d6f64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_ft5_3ep_s42_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_ft5_3ep_s42 BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_cased_ft5_3ep_s42 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ft5_3ep_s42` is an English model originally trained by AbhishekkV19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft5_3ep_s42_en_5.5.0_3.0_1727285680003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft5_3ep_s42_en_5.5.0_3.0_1727285680003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft5_3ep_s42","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft5_3ep_s42", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ft5_3ep_s42| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-cased-ft5-3ep-s42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_ft6_3ep_s42_2_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_ft6_3ep_s42_2_en.md new file mode 100644 index 00000000000000..fffdf026a5380d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_cased_ft6_3ep_s42_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_ft6_3ep_s42_2 BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_cased_ft6_3ep_s42_2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ft6_3ep_s42_2` is an English model originally trained by AbhishekkV19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft6_3ep_s42_2_en_5.5.0_3.0_1727290264273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft6_3ep_s42_2_en_5.5.0_3.0_1727290264273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft6_3ep_s42_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft6_3ep_s42_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ft6_3ep_s42_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-cased-ft6-3ep-s42-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline_en.md new file mode 100644 index 00000000000000..86131efc192141 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline pipeline BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline_en_5.5.0_3.0_1727290322812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline_en_5.5.0_3.0_1727290322812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_risk_opportunity_prediction_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-risk-opportunity-prediction-2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_chinese_text_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_chinese_text_classification_pipeline_en.md new file mode 100644 index 00000000000000..cd18a144556922 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_chinese_text_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_chinese_text_classification_pipeline pipeline BertForSequenceClassification from CeroShrijver +author: John Snow Labs +name: bert_base_chinese_text_classification_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_text_classification_pipeline` is an English model originally trained by CeroShrijver.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_text_classification_pipeline_en_5.5.0_3.0_1727307152582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_text_classification_pipeline_en_5.5.0_3.0_1727307152582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_chinese_text_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_chinese_text_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_text_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/CeroShrijver/bert-base-chinese-text-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_code_classification_mid_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_code_classification_mid_en.md new file mode 100644 index 00000000000000..d960a74691d8cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_code_classification_mid_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_finetuned_code_classification_mid BertForSequenceClassification from JUNstats +author: John Snow Labs +name: bert_base_finetuned_code_classification_mid +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_code_classification_mid` is an English model originally trained by JUNstats.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_code_classification_mid_en_5.5.0_3.0_1727286142226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_code_classification_mid_en_5.5.0_3.0_1727286142226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_code_classification_mid","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_code_classification_mid", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_code_classification_mid| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.8 MB| + +## References + +https://huggingface.co/JUNstats/bert-base-finetuned-code-classification-mid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_ynat_marip_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_ynat_marip_en.md new file mode 100644 index 00000000000000..f1e2b0727d2f4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_ynat_marip_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_finetuned_ynat_marip BertForSequenceClassification from marip +author: John Snow Labs +name: bert_base_finetuned_ynat_marip +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_ynat_marip` is an English model originally trained by marip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ynat_marip_en_5.5.0_3.0_1727291189975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ynat_marip_en_5.5.0_3.0_1727291189975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_ynat_marip","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_ynat_marip", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_ynat_marip| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/marip/bert-base-finetuned-ynat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_ynat_marip_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_ynat_marip_pipeline_en.md new file mode 100644 index 00000000000000..593498674998ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_finetuned_ynat_marip_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_finetuned_ynat_marip_pipeline pipeline BertForSequenceClassification from marip +author: John Snow Labs +name: bert_base_finetuned_ynat_marip_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_ynat_marip_pipeline` is an English model originally trained by marip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ynat_marip_pipeline_en_5.5.0_3.0_1727291212269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ynat_marip_pipeline_en_5.5.0_3.0_1727291212269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_finetuned_ynat_marip_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_finetuned_ynat_marip_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_ynat_marip_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/marip/bert-base-finetuned-ynat + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline_en.md new file mode 100644 index 00000000000000..87c1388bebc5ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline pipeline BertForTokenClassification from tbosse +author: John Snow Labs +name: bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline` is an English model originally trained by tbosse.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline_en_5.5.0_3.0_1727260524315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline_en_5.5.0_3.0_1727260524315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_finetuned_subj_pretrained_with_noisydata_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/tbosse/bert-base-german-cased-finetuned-subj_preTrained_with_noisyData_v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline_en.md new file mode 100644 index 00000000000000..bc2653017d2f38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline pipeline BertForTokenClassification from tbosse +author: John Snow Labs +name: bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline` is an English model originally trained by tbosse.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline_en_5.5.0_3.0_1727284316272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline_en_5.5.0_3.0_1727284316272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_finetuned_subj_v6_7epoch_v3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/tbosse/bert-base-german-cased-finetuned-subj_v6_7Epoch_v3 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_48_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_48_en.md new file mode 100644 index 00000000000000..f0c1584c632850 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_48_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_massive_intent_48 BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_base_massive_intent_48 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_massive_intent_48` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_48_en_5.5.0_3.0_1727288253168.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_48_en_5.5.0_3.0_1727288253168.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_massive_intent_48","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_massive_intent_48", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_massive_intent_48| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.4 MB| + +## References + +https://huggingface.co/gokuls/bert-base-Massive-intent_48 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_48_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_48_pipeline_en.md new file mode 100644 index 00000000000000..b24eaea6502448 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_48_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_massive_intent_48_pipeline pipeline BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_base_massive_intent_48_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_massive_intent_48_pipeline` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_48_pipeline_en_5.5.0_3.0_1727288277495.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_48_pipeline_en_5.5.0_3.0_1727288277495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_massive_intent_48_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_massive_intent_48_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_massive_intent_48_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.4 MB| + +## References + +https://huggingface.co/gokuls/bert-base-Massive-intent_48 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_en.md new file mode 100644 index 00000000000000..b568a1066c59c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_massive_intent_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_massive_intent BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_base_massive_intent +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_massive_intent` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_en_5.5.0_3.0_1727273184565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_en_5.5.0_3.0_1727273184565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_massive_intent","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_massive_intent", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_massive_intent| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/gokuls/bert-base-Massive-intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_msmarco_fiqa_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_msmarco_fiqa_pipeline_en.md new file mode 100644 index 00000000000000..4f19312118f6ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_msmarco_fiqa_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_msmarco_fiqa_pipeline pipeline BertForSequenceClassification from vittoriomaggio +author: John Snow Labs +name: bert_base_msmarco_fiqa_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_msmarco_fiqa_pipeline` is an English model originally trained by vittoriomaggio. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_msmarco_fiqa_pipeline_en_5.5.0_3.0_1727273492580.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_msmarco_fiqa_pipeline_en_5.5.0_3.0_1727273492580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_msmarco_fiqa_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_msmarco_fiqa_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_msmarco_fiqa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/vittoriomaggio/bert-base-msmarco-fiqa + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_emotion_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_emotion_pipeline_xx.md new file mode 100644 index 00000000000000..f88df482a5ea9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_emotion_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_emotion_pipeline pipeline BertForSequenceClassification from m8than +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_emotion_pipeline +date: 2024-09-25 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_emotion_pipeline` is a Multilingual model originally trained by m8than.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_emotion_pipeline_xx_5.5.0_3.0_1727269300599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_emotion_pipeline_xx_5.5.0_3.0_1727269300599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_finetuned_emotion_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_finetuned_emotion_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_emotion_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/m8than/bert-base-multilingual-cased-finetuned-emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam_xx.md new file mode 100644 index 00000000000000..7d6637e8df22eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam BertForSequenceClassification from emiliam +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam` is a Multilingual model originally trained by emiliam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam_xx_5.5.0_3.0_1727290539397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam_xx_5.5.0_3.0_1727290539397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_emiliam| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/emiliam/bert-base-multilingual-cased-finetuned-MeIA-AnalisisDeSentimientos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline_xx.md new file mode 100644 index 00000000000000..65df992b04914f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline pipeline BertForSequenceClassification from kevinid +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline +date: 2024-09-25 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline` is a Multilingual model originally trained by kevinid. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline_xx_5.5.0_3.0_1727290983365.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline_xx_5.5.0_3.0_1727290983365.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_meia_analisisdesentimientos_kevinid_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/kevinid/bert-base-multilingual-cased-finetuned-MeIA-AnalisisDeSentimientos + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline_xx.md new file mode 100644 index 00000000000000..9773f5036cebfa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline pipeline BertForSequenceClassification from Sam12111 +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline +date: 2024-09-25 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline` is a Multilingual model originally trained by Sam12111. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline_xx_5.5.0_3.0_1727286332125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline_xx_5.5.0_3.0_1727286332125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Sam12111/bert-base-multilingual-cased-finetuned-MeIA-AnalisisLoboSolitario + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_xx.md new file mode 100644 index 00000000000000..4dfaeeca7b215b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_meia_analisislobosolitario BertForSequenceClassification from Sam12111 +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_meia_analisislobosolitario +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_meia_analisislobosolitario` is a Multilingual model originally trained by Sam12111. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_xx_5.5.0_3.0_1727286297990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_meia_analisislobosolitario_xx_5.5.0_3.0_1727286297990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_meia_analisislobosolitario","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_meia_analisislobosolitario", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_meia_analisislobosolitario| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Sam12111/bert-base-multilingual-cased-finetuned-MeIA-AnalisisLoboSolitario \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_ner_geocorpus_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_ner_geocorpus_xx.md new file mode 100644 index 00000000000000..6ca0e9ef1355fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_finetuned_ner_geocorpus_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_ner_geocorpus BertForTokenClassification from GuiTap +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_ner_geocorpus +date: 2024-09-25 +tags: [xx, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_ner_geocorpus` is a Multilingual model originally trained by GuiTap. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_ner_geocorpus_xx_5.5.0_3.0_1727249867478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_ner_geocorpus_xx_5.5.0_3.0_1727249867478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_finetuned_ner_geocorpus","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_finetuned_ner_geocorpus", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_ner_geocorpus| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/GuiTap/bert-base-multilingual-cased-finetuned-ner-geocorpus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_qnli_10_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_qnli_10_pipeline_xx.md new file mode 100644 index 00000000000000..7a9c61a84b17d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_qnli_10_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_qnli_10_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_qnli_10_pipeline +date: 2024-09-25 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_qnli_10_pipeline` is a Multilingual model originally trained by tmnam20. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qnli_10_pipeline_xx_5.5.0_3.0_1727289889723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qnli_10_pipeline_xx_5.5.0_3.0_1727289889723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_qnli_10_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_qnli_10_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_qnli_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-qnli-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_qnli_10_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_qnli_10_xx.md new file mode 100644 index 00000000000000..27d8480bfe46b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_qnli_10_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_qnli_10 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_qnli_10 +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_qnli_10` is a Multilingual model originally trained by tmnam20. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qnli_10_xx_5.5.0_3.0_1727289851537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qnli_10_xx_5.5.0_3.0_1727289851537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_qnli_10","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_qnli_10", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_qnli_10| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-qnli-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_sst2_1_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_sst2_1_pipeline_xx.md new file mode 100644 index 00000000000000..dc64670757cd0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_sst2_1_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_sst2_1_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_sst2_1_pipeline +date: 2024-09-25 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_sst2_1_pipeline` is a Multilingual model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sst2_1_pipeline_xx_5.5.0_3.0_1727287077447.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sst2_1_pipeline_xx_5.5.0_3.0_1727287077447.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_sst2_1_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_sst2_1_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_sst2_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-sst2-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_sst2_1_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_sst2_1_xx.md new file mode 100644 index 00000000000000..ec0253503c041f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_sst2_1_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_sst2_1 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_sst2_1 +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_sst2_1` is a Multilingual model originally trained by tmnam20. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sst2_1_xx_5.5.0_3.0_1727287042684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sst2_1_xx_5.5.0_3.0_1727287042684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_sst2_1","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_sst2_1", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_sst2_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-sst2-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_vtoc_1_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_vtoc_1_xx.md new file mode 100644 index 00000000000000..fc5c0897349f6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_cased_vtoc_1_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_vtoc_1 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_vtoc_1 +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_vtoc_1` is a Multilingual model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vtoc_1_xx_5.5.0_3.0_1727289104383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vtoc_1_xx_5.5.0_3.0_1727289104383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_vtoc_1","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_vtoc_1", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_vtoc_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-vtoc-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_uncased_akazi_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_uncased_akazi_pipeline_xx.md new file mode 100644 index 00000000000000..c7e11715b9e833 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_uncased_akazi_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_akazi_pipeline pipeline BertForSequenceClassification from Akazi +author: John Snow Labs +name: bert_base_multilingual_uncased_akazi_pipeline +date: 2024-09-25 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_uncased_akazi_pipeline` is a Multilingual model originally trained by Akazi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_akazi_pipeline_xx_5.5.0_3.0_1727284563668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_akazi_pipeline_xx_5.5.0_3.0_1727284563668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_uncased_akazi_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_uncased_akazi_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_akazi_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/Akazi/bert-base-multilingual-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_uncased_sentiment_eternaut_xx.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_uncased_sentiment_eternaut_xx.md new file mode 100644 index 00000000000000..9d23eaaaf8fafb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_multilingual_uncased_sentiment_eternaut_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_eternaut BertForSequenceClassification from eternaut +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_eternaut +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_uncased_sentiment_eternaut` is a Multilingual model originally trained by eternaut. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_eternaut_xx_5.5.0_3.0_1727290286188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_eternaut_xx_5.5.0_3.0_1727290286188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_eternaut","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_eternaut", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_eternaut| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/eternaut/bert-base-multilingual-uncased-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_nlp100_title_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_nlp100_title_classification_pipeline_en.md new file mode 100644 index 00000000000000..c8afd9cbdcf76a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_nlp100_title_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_nlp100_title_classification_pipeline pipeline BertForSequenceClassification from udaizin +author: John Snow Labs +name: bert_base_nlp100_title_classification_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_nlp100_title_classification_pipeline` is an English model originally trained by udaizin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_nlp100_title_classification_pipeline_en_5.5.0_3.0_1727268209093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_nlp100_title_classification_pipeline_en_5.5.0_3.0_1727268209093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_nlp100_title_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_nlp100_title_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_nlp100_title_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/udaizin/bert-base-nlp100_title_classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_portuguese_cased_hatebr_pipeline_pt.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_portuguese_cased_hatebr_pipeline_pt.md new file mode 100644 index 00000000000000..01ffcfc692e1df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_portuguese_cased_hatebr_pipeline_pt.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Portuguese bert_base_portuguese_cased_hatebr_pipeline pipeline BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_base_portuguese_cased_hatebr_pipeline +date: 2024-09-25 +tags: [pt, open_source, pipeline, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_hatebr_pipeline` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_hatebr_pipeline_pt_5.5.0_3.0_1727293413743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_hatebr_pipeline_pt_5.5.0_3.0_1727293413743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_portuguese_cased_hatebr_pipeline", lang = "pt") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_portuguese_cased_hatebr_pipeline", lang = "pt") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_hatebr_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ruanchaves/bert-base-portuguese-cased-hatebr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_portuguese_cased_hatebr_pt.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_portuguese_cased_hatebr_pt.md new file mode 100644 index 00000000000000..96c5e44e86d518 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_portuguese_cased_hatebr_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese bert_base_portuguese_cased_hatebr BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_base_portuguese_cased_hatebr +date: 2024-09-25 +tags: [pt, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_hatebr` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_hatebr_pt_5.5.0_3.0_1727293392007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_hatebr_pt_5.5.0_3.0_1727293392007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_hatebr","pt") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_hatebr", "pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_hatebr| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ruanchaves/bert-base-portuguese-cased-hatebr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_sentiment_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_sentiment_en.md new file mode 100644 index 00000000000000..019dd39f4976ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_sentiment_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_sentiment BertForSequenceClassification from 51la5 +author: John Snow Labs +name: bert_base_sentiment +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_sentiment` is an English model originally trained by 51la5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_sentiment_en_5.5.0_3.0_1727291398488.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_sentiment_en_5.5.0_3.0_1727291398488.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sentiment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sentiment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_sentiment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/51la5/bert-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..6c1daaababcaaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_sentiment_pipeline pipeline BertForSequenceClassification from 51la5 +author: John Snow Labs +name: bert_base_sentiment_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_sentiment_pipeline` is an English model originally trained by 51la5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_sentiment_pipeline_en_5.5.0_3.0_1727291419363.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_sentiment_pipeline_en_5.5.0_3.0_1727291419363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/51la5/bert-base-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_cased_k3_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_cased_k3_en.md new file mode 100644 index 00000000000000..9e7b07f82f4086 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_cased_k3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_k3 BertForSequenceClassification from dtorber +author: John Snow Labs +name: bert_base_spanish_wwm_cased_k3 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_k3` is an English model originally trained by dtorber. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k3_en_5.5.0_3.0_1727292308209.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k3_en_5.5.0_3.0_1727292308209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_cased_k3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_cased_k3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_k3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/dtorber/bert-base-spanish-wwm-cased_K3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_cased_k3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_cased_k3_pipeline_en.md new file mode 100644 index 00000000000000..20e6576d001f90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_cased_k3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_k3_pipeline pipeline BertForSequenceClassification from dtorber +author: John Snow Labs +name: bert_base_spanish_wwm_cased_k3_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_k3_pipeline` is an English model originally trained by dtorber. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k3_pipeline_en_5.5.0_3.0_1727292329578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k3_pipeline_en_5.5.0_3.0_1727292329578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_cased_k3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_cased_k3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_k3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/dtorber/bert-base-spanish-wwm-cased_K3 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_f_tag_0_3_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_f_tag_0_3_en.md new file mode 100644 index 00000000000000..e096a2bec92594 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_f_tag_0_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_f_tag_0_3 BertForSequenceClassification from ISA-Group +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_f_tag_0_3 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_f_tag_0_3` is an English model originally trained by ISA-Group. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_f_tag_0_3_en_5.5.0_3.0_1727289452265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_f_tag_0_3_en_5.5.0_3.0_1727289452265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_uncased_f_tag_0_3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_uncased_f_tag_0_3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_f_tag_0_3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/ISA-Group/bert-base-spanish-wwm-uncased_f-tag-0.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline_en.md new file mode 100644 index 00000000000000..3ae14b2ab140f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline pipeline BertForSequenceClassification from ISA-Group +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline` is an English model originally trained by ISA-Group. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline_en_5.5.0_3.0_1727289474102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline_en_5.5.0_3.0_1727289474102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_f_tag_0_3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/ISA-Group/bert-base-spanish-wwm-uncased_f-tag-0.3 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_r_tag_0_2_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_r_tag_0_2_en.md new file mode 100644 index 00000000000000..25ab24b1de31d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_r_tag_0_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_r_tag_0_2 BertForSequenceClassification from ISA-Group +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_r_tag_0_2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_r_tag_0_2` is an English model originally trained by ISA-Group. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_r_tag_0_2_en_5.5.0_3.0_1727290462173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_r_tag_0_2_en_5.5.0_3.0_1727290462173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_uncased_r_tag_0_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_uncased_r_tag_0_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_r_tag_0_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/ISA-Group/bert-base-spanish-wwm-uncased_r-tag-0.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline_en.md new file mode 100644 index 00000000000000..7f237f2c493ef5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline pipeline BertForSequenceClassification from ISA-Group +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline` is an English model originally trained by ISA-Group. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline_en_5.5.0_3.0_1727290483362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline_en_5.5.0_3.0_1727290483362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_r_tag_0_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/ISA-Group/bert-base-spanish-wwm-uncased_r-tag-0.2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_sst2_gokuls_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_sst2_gokuls_en.md new file mode 100644 index 00000000000000..a107e510e47560 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_sst2_gokuls_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_sst2_gokuls BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_base_sst2_gokuls +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_sst2_gokuls` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_sst2_gokuls_en_5.5.0_3.0_1727290604390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_sst2_gokuls_en_5.5.0_3.0_1727290604390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sst2_gokuls","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sst2_gokuls", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_sst2_gokuls| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/gokuls/bert-base-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_turkish_128k_cased_offensive_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_turkish_128k_cased_offensive_pipeline_tr.md new file mode 100644 index 00000000000000..d4ecddaec1d3ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_turkish_128k_cased_offensive_pipeline_tr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Turkish bert_base_turkish_128k_cased_offensive_pipeline pipeline BertForSequenceClassification from Overfit-GM +author: John Snow Labs +name: bert_base_turkish_128k_cased_offensive_pipeline +date: 2024-09-25 +tags: [tr, open_source, pipeline, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_128k_cased_offensive_pipeline` is a Turkish model originally trained by Overfit-GM. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_128k_cased_offensive_pipeline_tr_5.5.0_3.0_1727307289460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_128k_cased_offensive_pipeline_tr_5.5.0_3.0_1727307289460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_turkish_128k_cased_offensive_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_turkish_128k_cased_offensive_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_128k_cased_offensive_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|691.2 MB| + +## References + +https://huggingface.co/Overfit-GM/bert-base-turkish-128k-cased-offensive + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_turkish_128k_cased_offensive_tr.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_turkish_128k_cased_offensive_tr.md new file mode 100644 index 00000000000000..a0c6ac260809ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_turkish_128k_cased_offensive_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish bert_base_turkish_128k_cased_offensive BertForSequenceClassification from Overfit-GM +author: John Snow Labs +name: bert_base_turkish_128k_cased_offensive +date: 2024-09-25 +tags: [tr, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_128k_cased_offensive` is a Turkish model originally trained by Overfit-GM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_128k_cased_offensive_tr_5.5.0_3.0_1727307248982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_128k_cased_offensive_tr_5.5.0_3.0_1727307248982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_128k_cased_offensive","tr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_128k_cased_offensive", "tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_128k_cased_offensive| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|691.1 MB| + +## References + +https://huggingface.co/Overfit-GM/bert-base-turkish-128k-cased-offensive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_8_50_0_01_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_8_50_0_01_pipeline_en.md new file mode 100644 index 00000000000000..f4f0e972966a9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_8_50_0_01_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_8_50_0_01_pipeline pipeline BertForSequenceClassification from daisyxie21 +author: John Snow Labs +name: bert_base_uncased_8_50_0_01_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_8_50_0_01_pipeline` is an English model originally trained by daisyxie21. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_8_50_0_01_pipeline_en_5.5.0_3.0_1727276697325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_8_50_0_01_pipeline_en_5.5.0_3.0_1727276697325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_8_50_0_01_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_8_50_0_01_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_8_50_0_01_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.4 MB| + +## References + +https://huggingface.co/daisyxie21/bert-base-uncased-8-50-0.01 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_ad_nonad_classifer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_ad_nonad_classifer_pipeline_en.md new file mode 100644 index 00000000000000..495c14923c6c93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_ad_nonad_classifer_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_ad_nonad_classifer_pipeline pipeline BertForSequenceClassification from Kaleemullah +author: John Snow Labs +name: bert_base_uncased_ad_nonad_classifer_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ad_nonad_classifer_pipeline` is an English model originally trained by Kaleemullah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ad_nonad_classifer_pipeline_en_5.5.0_3.0_1727285275710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ad_nonad_classifer_pipeline_en_5.5.0_3.0_1727285275710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_ad_nonad_classifer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_ad_nonad_classifer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ad_nonad_classifer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kaleemullah/bert-base-uncased-ad-nonad-classifer + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_pysentimiento_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_pysentimiento_en.md new file mode 100644 index 00000000000000..5632d19f778046 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_pysentimiento_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_emotion_pysentimiento BertForSequenceClassification from pysentimiento +author: John Snow Labs +name: bert_base_uncased_emotion_pysentimiento +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_emotion_pysentimiento` is an English model originally trained by pysentimiento. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_pysentimiento_en_5.5.0_3.0_1727306375747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_pysentimiento_en_5.5.0_3.0_1727306375747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_pysentimiento","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_pysentimiento", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_emotion_pysentimiento| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/pysentimiento/bert-base-uncased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_ricocheh_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_ricocheh_pipeline_en.md new file mode 100644 index 00000000000000..c3df5abddbef64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_ricocheh_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_emotion_ricocheh_pipeline pipeline BertForSequenceClassification from RicoCHEH +author: John Snow Labs +name: bert_base_uncased_emotion_ricocheh_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_emotion_ricocheh_pipeline` is an English model originally trained by RicoCHEH. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_ricocheh_pipeline_en_5.5.0_3.0_1727308442335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_ricocheh_pipeline_en_5.5.0_3.0_1727308442335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_emotion_ricocheh_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_emotion_ricocheh_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_emotion_ricocheh_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/RicoCHEH/bert-base-uncased-emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_v1_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_v1_en.md new file mode 100644 index 00000000000000..616db57a955ac1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_emotion_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_emotion_v1 BertForSequenceClassification from Cesar42 +author: John Snow Labs +name: bert_base_uncased_emotion_v1 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_emotion_v1` is an English model originally trained by Cesar42. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_v1_en_5.5.0_3.0_1727279789936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_v1_en_5.5.0_3.0_1727279789936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_v1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_v1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_emotion_v1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Cesar42/bert-base-uncased-emotion_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_amazon_reviews_multi_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_amazon_reviews_multi_pipeline_en.md new file mode 100644 index 00000000000000..e6c3e2ee600f22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_amazon_reviews_multi_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_amazon_reviews_multi_pipeline pipeline BertForSequenceClassification from JoelVIU +author: John Snow Labs +name: bert_base_uncased_finetuned_amazon_reviews_multi_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_amazon_reviews_multi_pipeline` is an English model originally trained by JoelVIU. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_amazon_reviews_multi_pipeline_en_5.5.0_3.0_1727286302655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_amazon_reviews_multi_pipeline_en_5.5.0_3.0_1727286302655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_amazon_reviews_multi_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_amazon_reviews_multi_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_amazon_reviews_multi_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JoelVIU/bert-base-uncased-finetuned-amazon_reviews_multi + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cda_gender_neutral_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cda_gender_neutral_pipeline_en.md new file mode 100644 index 00000000000000..ad94055f949ac4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cda_gender_neutral_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cda_gender_neutral_pipeline pipeline BertEmbeddings from zz990906 +author: John Snow Labs +name: bert_base_uncased_finetuned_cda_gender_neutral_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cda_gender_neutral_pipeline` is an English model originally trained by zz990906. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cda_gender_neutral_pipeline_en_5.5.0_3.0_1727232590873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cda_gender_neutral_pipeline_en_5.5.0_3.0_1727232590873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cda_gender_neutral_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cda_gender_neutral_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cda_gender_neutral_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/zz990906/bert-base-uncased-finetuned-cda-gender-neutral + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clause_type_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clause_type_pipeline_en.md new file mode 100644 index 00000000000000..6711f2d63a8b38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clause_type_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_clause_type_pipeline pipeline BertForSequenceClassification from mauro +author: John Snow Labs +name: bert_base_uncased_finetuned_clause_type_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_clause_type_pipeline` is an English model originally trained by mauro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clause_type_pipeline_en_5.5.0_3.0_1727307408276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clause_type_pipeline_en_5.5.0_3.0_1727307408276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_clause_type_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_clause_type_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_clause_type_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/mauro/bert-base-uncased-finetuned-clause-type + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clinc_oos_nikitakapitan_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clinc_oos_nikitakapitan_en.md new file mode 100644 index 00000000000000..3f1828f0cbfff3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clinc_oos_nikitakapitan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_clinc_oos_nikitakapitan BertForSequenceClassification from nikitakapitan +author: John Snow Labs +name: bert_base_uncased_finetuned_clinc_oos_nikitakapitan +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_clinc_oos_nikitakapitan` is an English model originally trained by nikitakapitan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clinc_oos_nikitakapitan_en_5.5.0_3.0_1727306197895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clinc_oos_nikitakapitan_en_5.5.0_3.0_1727306197895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_clinc_oos_nikitakapitan","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_clinc_oos_nikitakapitan", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_clinc_oos_nikitakapitan| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/nikitakapitan/bert-base-uncased-finetuned-clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline_en.md new file mode 100644 index 00000000000000..87063695fbff30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline pipeline BertForSequenceClassification from nikitakapitan +author: John Snow Labs +name: bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline` is an English model originally trained by nikitakapitan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline_en_5.5.0_3.0_1727306222785.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline_en_5.5.0_3.0_1727306222785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_clinc_oos_nikitakapitan_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/nikitakapitan/bert-base-uncased-finetuned-clinc_oos + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_avb_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_avb_en.md new file mode 100644 index 00000000000000..9494654d15a123 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_avb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_avb BertForSequenceClassification from avb +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_avb +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_avb` is an English model originally trained by avb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_avb_en_5.5.0_3.0_1727268554800.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_avb_en_5.5.0_3.0_1727268554800.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_avb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_avb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_avb| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/avb/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_en.md new file mode 100644 index 00000000000000..ab94780aafb99f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00 BertForSequenceClassification from sepehrbakhshi +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00` is an English model originally trained by sepehrbakhshi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_en_5.5.0_3.0_1727289622011.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_en_5.5.0_3.0_1727289622011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sepehrbakhshi/bert-base-uncased-finetuned-cola_HW2_sepehr_bakhshi_dropout_00 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline_en.md new file mode 100644 index 00000000000000..50f6f0cd51bddc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline pipeline BertForSequenceClassification from sepehrbakhshi +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline` is an English model originally trained by sepehrbakhshi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline_en_5.5.0_3.0_1727289643692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline_en_5.5.0_3.0_1727289643692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_hw2_sepehr_bakhshi_dropout_00_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sepehrbakhshi/bert-base-uncased-finetuned-cola_HW2_sepehr_bakhshi_dropout_00 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_kaanha_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_kaanha_pipeline_en.md new file mode 100644 index 00000000000000..eb58bed700e2e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_kaanha_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_kaanha_pipeline pipeline BertForSequenceClassification from KaanHa +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_kaanha_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_kaanha_pipeline` is an English model originally trained by KaanHa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_kaanha_pipeline_en_5.5.0_3.0_1727287154428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_kaanha_pipeline_en_5.5.0_3.0_1727287154428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_kaanha_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_kaanha_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_kaanha_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/KaanHa/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline_en.md new file mode 100644 index 00000000000000..ee40536cbec49f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline pipeline BertForSequenceClassification from cansurav +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline` is an English model originally trained by cansurav.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline_en_5.5.0_3.0_1727286410801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline_en_5.5.0_3.0_1727286410801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_learning_rate_2e_05_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-learning_rate-2e-05 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_mofyrt_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_mofyrt_en.md new file mode 100644 index 00000000000000..adb77bea383f49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_mofyrt_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_mofyrt BertForSequenceClassification from mofyrt +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_mofyrt +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_mofyrt` is an English model originally trained by mofyrt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_mofyrt_en_5.5.0_3.0_1727285863460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_mofyrt_en_5.5.0_3.0_1727285863460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_mofyrt","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_mofyrt", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_mofyrt| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mofyrt/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_mofyrt_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_mofyrt_pipeline_en.md new file mode 100644 index 00000000000000..a47612b1b75a4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_mofyrt_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_mofyrt_pipeline pipeline BertForSequenceClassification from mofyrt +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_mofyrt_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_mofyrt_pipeline` is an English model originally trained by mofyrt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_mofyrt_pipeline_en_5.5.0_3.0_1727285885672.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_mofyrt_pipeline_en_5.5.0_3.0_1727285885672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_mofyrt_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_mofyrt_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_mofyrt_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mofyrt/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_orcan_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_orcan_en.md new file mode 100644 index 00000000000000..58f018a3c2568b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_orcan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_orcan BertForSequenceClassification from orcan +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_orcan +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_orcan` is an English model originally trained by orcan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_orcan_en_5.5.0_3.0_1727286173002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_orcan_en_5.5.0_3.0_1727286173002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_orcan","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_orcan", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_orcan| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/orcan/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_orcan_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_orcan_pipeline_en.md new file mode 100644 index 00000000000000..3f7aea1fc114d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_orcan_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_orcan_pipeline pipeline BertForSequenceClassification from orcan +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_orcan_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_orcan_pipeline` is an English model originally trained by orcan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_orcan_pipeline_en_5.5.0_3.0_1727286196829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_orcan_pipeline_en_5.5.0_3.0_1727286196829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_orcan_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_orcan_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_orcan_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/orcan/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_senihylmz_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_senihylmz_en.md new file mode 100644 index 00000000000000..e80640fc3b58e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_senihylmz_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_senihylmz BertForSequenceClassification from senihylmz +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_senihylmz +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_senihylmz` is an English model originally trained by senihylmz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_senihylmz_en_5.5.0_3.0_1727285373213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_senihylmz_en_5.5.0_3.0_1727285373213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_senihylmz","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_senihylmz", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_senihylmz| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senihylmz/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_sepehr_final_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_sepehr_final_en.md new file mode 100644 index 00000000000000..76c00d09314246 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_sepehr_final_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_sepehr_final BertForSequenceClassification from sepehrbakhshi +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_sepehr_final +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_sepehr_final` is an English model originally trained by sepehrbakhshi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_final_en_5.5.0_3.0_1727288826333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_final_en_5.5.0_3.0_1727288826333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_sepehr_final","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_sepehr_final", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_sepehr_final| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sepehrbakhshi/bert-base-uncased-finetuned-cola_sepehr_final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline_en.md new file mode 100644 index 00000000000000..39de9f89ef0d5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline pipeline BertForSequenceClassification from sepehrbakhshi +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline` is an English model originally trained by sepehrbakhshi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline_en_5.5.0_3.0_1727288748952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline_en_5.5.0_3.0_1727288748952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sepehrbakhshi/bert-base-uncased-finetuned-cola_sepehr_sepehr_sepehr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_depression_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_depression_en.md new file mode 100644 index 00000000000000..a6c5ae1d87f959 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_depression_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_depression BertForSequenceClassification from welsachy +author: John Snow Labs +name: bert_base_uncased_finetuned_depression +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_depression` is an English model originally trained by welsachy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_depression_en_5.5.0_3.0_1727276739655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_depression_en_5.5.0_3.0_1727276739655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_depression","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_depression", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_depression| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/welsachy/bert-base-uncased-finetuned-depression \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_emotion_nikitakapitan_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_emotion_nikitakapitan_en.md new file mode 100644 index 00000000000000..7e26de71308613 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_emotion_nikitakapitan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_emotion_nikitakapitan BertForSequenceClassification from nikitakapitan +author: John Snow Labs +name: bert_base_uncased_finetuned_emotion_nikitakapitan +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_emotion_nikitakapitan` is an English model originally trained by nikitakapitan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_emotion_nikitakapitan_en_5.5.0_3.0_1727302550473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_emotion_nikitakapitan_en_5.5.0_3.0_1727302550473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_emotion_nikitakapitan","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_emotion_nikitakapitan", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_emotion_nikitakapitan| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nikitakapitan/bert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_filtered_0602_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_filtered_0602_en.md new file mode 100644 index 00000000000000..b61b2633cada75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_filtered_0602_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_filtered_0602 BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: bert_base_uncased_finetuned_filtered_0602 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_filtered_0602` is an English model originally trained by YeRyeongLee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_filtered_0602_en_5.5.0_3.0_1727284691216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_filtered_0602_en_5.5.0_3.0_1727284691216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_filtered_0602","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_filtered_0602", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_filtered_0602| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/YeRyeongLee/bert-base-uncased-finetuned-filtered-0602 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_en.md new file mode 100644 index 00000000000000..16d3098e4244b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mnli_max_length_256_epoch_6 BertForSequenceClassification from yy642 +author: John Snow Labs +name: bert_base_uncased_finetuned_mnli_max_length_256_epoch_6 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mnli_max_length_256_epoch_6` is an English model originally trained by yy642. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_en_5.5.0_3.0_1727287571878.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_en_5.5.0_3.0_1727287571878.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mnli_max_length_256_epoch_6","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mnli_max_length_256_epoch_6", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mnli_max_length_256_epoch_6| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yy642/bert-base-uncased-finetuned-mnli-max-length-256-epoch-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline_en.md new file mode 100644 index 00000000000000..860ef94672bf9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline pipeline BertForSequenceClassification from yy642 +author: John Snow Labs +name: bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline` is an English model originally trained by yy642. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline_en_5.5.0_3.0_1727287593103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline_en_5.5.0_3.0_1727287593103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mnli_max_length_256_epoch_6_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yy642/bert-base-uncased-finetuned-mnli-max-length-256-epoch-6 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi_en.md new file mode 100644 index 00000000000000..9b274cf8d8fc3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi BertForSequenceClassification from VitaliiVrublevskyi +author: John Snow Labs +name: bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi` is an English model originally trained by VitaliiVrublevskyi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi_en_5.5.0_3.0_1727288570620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi_en_5.5.0_3.0_1727288570620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mrpc_vitaliivrublevskyi| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/VitaliiVrublevskyi/bert-base-uncased-finetuned-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_poli_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_poli_en.md new file mode 100644 index 00000000000000..11accc2bfde2a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_poli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_poli BertForSequenceClassification from lmajer +author: John Snow Labs +name: bert_base_uncased_finetuned_poli +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_poli` is an English model originally trained by lmajer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_poli_en_5.5.0_3.0_1727284926701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_poli_en_5.5.0_3.0_1727284926701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_poli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_poli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_poli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/lmajer/bert-base-uncased-finetuned-POLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_qnli_anamelchor_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_qnli_anamelchor_en.md new file mode 100644 index 00000000000000..32b083386aad8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_qnli_anamelchor_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_qnli_anamelchor BertForSequenceClassification from anamelchor +author: John Snow Labs +name: bert_base_uncased_finetuned_qnli_anamelchor +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_qnli_anamelchor` is an English model originally trained by anamelchor.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qnli_anamelchor_en_5.5.0_3.0_1727286825080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qnli_anamelchor_en_5.5.0_3.0_1727286825080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_qnli_anamelchor","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_qnli_anamelchor", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_qnli_anamelchor| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/anamelchor/bert-base-uncased-finetuned-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_qnli_anamelchor_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_qnli_anamelchor_pipeline_en.md new file mode 100644 index 00000000000000..804ced33a1016a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_qnli_anamelchor_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_qnli_anamelchor_pipeline pipeline BertForSequenceClassification from anamelchor +author: John Snow Labs +name: bert_base_uncased_finetuned_qnli_anamelchor_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_qnli_anamelchor_pipeline` is an English model originally trained by anamelchor.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qnli_anamelchor_pipeline_en_5.5.0_3.0_1727286846283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qnli_anamelchor_pipeline_en_5.5.0_3.0_1727286846283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_qnli_anamelchor_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_qnli_anamelchor_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
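The pipeline snippet above calls `transform(df)` without showing where `df` comes from. The following is a rough sketch of one way to build it, assuming the pretrained pipeline reads its input from a column named `text` (as the `DocumentAssembler` examples elsewhere in these cards do) and that a Spark session is started with `sparknlp.start()`. The helper names `to_rows` and `classify` are hypothetical.

```python
def to_rows(texts):
    """Wrap each input string in a single-element row for createDataFrame."""
    return [[t] for t in texts]


def classify(texts, pipeline_name="bert_base_uncased_finetuned_qnli_anamelchor_pipeline"):
    """Run a pretrained pipeline over a list of strings and return the result DataFrame.

    Imports are local so the pure helper above can be used without Spark installed.
    Assumes the pipeline expects an input column named "text".
    """
    import sparknlp
    from sparknlp.pretrained import PretrainedPipeline

    spark = sparknlp.start()
    df = spark.createDataFrame(to_rows(texts)).toDF("text")
    pipeline = PretrainedPipeline(pipeline_name, lang="en")
    return pipeline.transform(df)
```

Calling `classify(["I love spark-nlp"])` would download the pipeline on first use, so it needs network access and a working Spark NLP installation.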
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_qnli_anamelchor_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/anamelchor/bert-base-uncased-finetuned-qnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_rte_max_length_512_epoch_5_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_rte_max_length_512_epoch_5_en.md new file mode 100644 index 00000000000000..e285732c8a48a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_rte_max_length_512_epoch_5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_rte_max_length_512_epoch_5 BertForSequenceClassification from yy642 +author: John Snow Labs +name: bert_base_uncased_finetuned_rte_max_length_512_epoch_5 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_rte_max_length_512_epoch_5` is an English model originally trained by yy642.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_max_length_512_epoch_5_en_5.5.0_3.0_1727286788718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_max_length_512_epoch_5_en_5.5.0_3.0_1727286788718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_rte_max_length_512_epoch_5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_rte_max_length_512_epoch_5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_rte_max_length_512_epoch_5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yy642/bert-base-uncased-finetuned-rte-max-length-512-epoch-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_codeinjax_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_codeinjax_pipeline_en.md new file mode 100644 index 00000000000000..3208f81438bb4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_codeinjax_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_codeinjax_pipeline pipeline BertForSequenceClassification from CodeinJax +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_codeinjax_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_codeinjax_pipeline` is an English model originally trained by CodeinJax.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_codeinjax_pipeline_en_5.5.0_3.0_1727301056112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_codeinjax_pipeline_en_5.5.0_3.0_1727301056112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_sst2_codeinjax_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_sst2_codeinjax_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_codeinjax_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/CodeinJax/bert-base-uncased-finetuned-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline_en.md new file mode 100644 index 00000000000000..955cab3f9ae792 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline pipeline BertForSequenceClassification from sasuke +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline` is an English model originally trained by sasuke.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline_en_5.5.0_3.0_1727279625215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline_en_5.5.0_3.0_1727279625215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_finetuned_sst2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sasuke/bert-base-uncased-finetuned-sst2-finetuned-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_sst2_membership_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_sst2_membership_pipeline_en.md new file mode 100644 index 00000000000000..f177e0a6ccfcab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_sst2_sst2_membership_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_sst2_membership_pipeline pipeline BertForSequenceClassification from doyoungkim +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_sst2_membership_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_sst2_membership_pipeline` is an English model originally trained by doyoungkim.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_sst2_membership_pipeline_en_5.5.0_3.0_1727307026098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_sst2_membership_pipeline_en_5.5.0_3.0_1727307026098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_sst2_sst2_membership_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_sst2_sst2_membership_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_sst2_membership_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/doyoungkim/bert-base-uncased-finetuned-sst2-sst2-membership + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_vedantgaur_human_generated_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_vedantgaur_human_generated_en.md new file mode 100644 index 00000000000000..e3a0cf8bf96454 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_vedantgaur_human_generated_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_vedantgaur_human_generated BertForSequenceClassification from SkwarczynskiP +author: John Snow Labs +name: bert_base_uncased_finetuned_vedantgaur_human_generated +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_vedantgaur_human_generated` is an English model originally trained by SkwarczynskiP.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_vedantgaur_human_generated_en_5.5.0_3.0_1727297555919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_vedantgaur_human_generated_en_5.5.0_3.0_1727297555919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_vedantgaur_human_generated","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_vedantgaur_human_generated", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_vedantgaur_human_generated| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/SkwarczynskiP/bert-base-uncased-finetuned-vedantgaur-human-generated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline_en.md new file mode 100644 index 00000000000000..a292e4af6ce51d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline pipeline BertForSequenceClassification from SkwarczynskiP +author: John Snow Labs +name: bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline` is an English model originally trained by SkwarczynskiP.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline_en_5.5.0_3.0_1727297579157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline_en_5.5.0_3.0_1727297579157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_vedantgaur_human_generated_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/SkwarczynskiP/bert-base-uncased-finetuned-vedantgaur-human-generated + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_glue_mrpc_manasip25_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_glue_mrpc_manasip25_en.md new file mode 100644 index 00000000000000..52e72b636be7db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_glue_mrpc_manasip25_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_glue_mrpc_manasip25 BertForSequenceClassification from manasip25 +author: John Snow Labs +name: bert_base_uncased_glue_mrpc_manasip25 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_glue_mrpc_manasip25` is an English model originally trained by manasip25.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_mrpc_manasip25_en_5.5.0_3.0_1727306619070.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_mrpc_manasip25_en_5.5.0_3.0_1727306619070.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_glue_mrpc_manasip25","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_glue_mrpc_manasip25", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_glue_mrpc_manasip25| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/manasip25/bert-base-uncased-glue-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_goemotions_original_finetuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_goemotions_original_finetuned_pipeline_en.md new file mode 100644 index 00000000000000..5f689575454f5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_goemotions_original_finetuned_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_goemotions_original_finetuned_pipeline pipeline BertForSequenceClassification from justin871030 +author: John Snow Labs +name: bert_base_uncased_goemotions_original_finetuned_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_goemotions_original_finetuned_pipeline` is an English model originally trained by justin871030.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_goemotions_original_finetuned_pipeline_en_5.5.0_3.0_1727256813335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_goemotions_original_finetuned_pipeline_en_5.5.0_3.0_1727256813335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_goemotions_original_finetuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_goemotions_original_finetuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_goemotions_original_finetuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/justin871030/bert-base-uncased-goemotions-original-finetuned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_hate_offensive_normal_speech_lr_2e_05_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_hate_offensive_normal_speech_lr_2e_05_en.md new file mode 100644 index 00000000000000..f4fb32ca5a7801 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_hate_offensive_normal_speech_lr_2e_05_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_hate_offensive_normal_speech_lr_2e_05 BertForSequenceClassification from DrishtiSharma +author: John Snow Labs +name: bert_base_uncased_hate_offensive_normal_speech_lr_2e_05 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_hate_offensive_normal_speech_lr_2e_05` is an English model originally trained by DrishtiSharma.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hate_offensive_normal_speech_lr_2e_05_en_5.5.0_3.0_1727307733548.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hate_offensive_normal_speech_lr_2e_05_en_5.5.0_3.0_1727307733548.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hate_offensive_normal_speech_lr_2e_05","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hate_offensive_normal_speech_lr_2e_05", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_hate_offensive_normal_speech_lr_2e_05| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/DrishtiSharma/bert-base-uncased-hate-offensive-normal-speech-lr-2e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_header_plus_content_textsim_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_header_plus_content_textsim_pipeline_en.md new file mode 100644 index 00000000000000..ba93556796e09d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_header_plus_content_textsim_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_header_plus_content_textsim_pipeline pipeline BertForSequenceClassification from kaanakdeniz +author: John Snow Labs +name: bert_base_uncased_header_plus_content_textsim_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_header_plus_content_textsim_pipeline` is an English model originally trained by kaanakdeniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_header_plus_content_textsim_pipeline_en_5.5.0_3.0_1727286529238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_header_plus_content_textsim_pipeline_en_5.5.0_3.0_1727286529238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_header_plus_content_textsim_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_header_plus_content_textsim_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_header_plus_content_textsim_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kaanakdeniz/bert_base_uncased_header_plus_content_textsim + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_hoax_classifier_fulltext_1h2r_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_hoax_classifier_fulltext_1h2r_en.md new file mode 100644 index 00000000000000..06b04e008bb137 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_hoax_classifier_fulltext_1h2r_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_hoax_classifier_fulltext_1h2r BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_base_uncased_hoax_classifier_fulltext_1h2r +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_hoax_classifier_fulltext_1h2r` is an English model originally trained by research-dump.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_fulltext_1h2r_en_5.5.0_3.0_1727285877380.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_fulltext_1h2r_en_5.5.0_3.0_1727285877380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_fulltext_1h2r","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_fulltext_1h2r", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_hoax_classifier_fulltext_1h2r| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/research-dump/bert-base-uncased_hoax_classifier_fulltext_1h2r \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_imdb_saved_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_imdb_saved_en.md new file mode 100644 index 00000000000000..b18cace77fe17a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_imdb_saved_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_imdb_saved BertForSequenceClassification from thaile +author: John Snow Labs +name: bert_base_uncased_imdb_saved +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_imdb_saved` is an English model originally trained by thaile. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_imdb_saved_en_5.5.0_3.0_1727266471958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_imdb_saved_en_5.5.0_3.0_1727266471958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_imdb_saved","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_imdb_saved", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_imdb_saved| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/thaile/bert-base-uncased-imdb-saved \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_issues_128_anantonios9_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_issues_128_anantonios9_pipeline_en.md new file mode 100644 index 00000000000000..2988f9bb1a6d61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_issues_128_anantonios9_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_issues_128_anantonios9_pipeline pipeline BertEmbeddings from anantonios9 +author: John Snow Labs +name: bert_base_uncased_issues_128_anantonios9_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_issues_128_anantonios9_pipeline` is an English model originally trained by anantonios9. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_anantonios9_pipeline_en_5.5.0_3.0_1727255974098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_issues_128_anantonios9_pipeline_en_5.5.0_3.0_1727255974098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_issues_128_anantonios9_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_issues_128_anantonios9_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_issues_128_anantonios9_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/anantonios9/bert-base-uncased-issues-128 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_malayalam_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_malayalam_en.md new file mode 100644 index 00000000000000..f0f896e84fe9a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_malayalam_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_malayalam BertEmbeddings from Tural +author: John Snow Labs +name: bert_base_uncased_malayalam +date: 2024-09-25 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_malayalam` is an English model originally trained by Tural. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_malayalam_en_5.5.0_3.0_1727232977924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_malayalam_en_5.5.0_3.0_1727232977924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_malayalam","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("bert_base_uncased_malayalam","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_malayalam| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|407.9 MB| + +## References + +https://huggingface.co/Tural/bert-base-uncased-ml \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_epochs_10_lr_5e_05_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_epochs_10_lr_5e_05_en.md new file mode 100644 index 00000000000000..dbc70f33ee87d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_epochs_10_lr_5e_05_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_epochs_10_lr_5e_05 BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_mrpc_epochs_10_lr_5e_05 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_epochs_10_lr_5e_05` is an English model originally trained by prateeky2806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_epochs_10_lr_5e_05_en_5.5.0_3.0_1727307085988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_epochs_10_lr_5e_05_en_5.5.0_3.0_1727307085988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_epochs_10_lr_5e_05","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_epochs_10_lr_5e_05", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_epochs_10_lr_5e_05| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-mrpc-epochs-10-lr-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline_en.md new file mode 100644 index 00000000000000..f84ac7401d18e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline pipeline BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline` is an English model originally trained by prateeky2806.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline_en_5.5.0_3.0_1727307107522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline_en_5.5.0_3.0_1727307107522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_epochs_10_lr_5e_05_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-mrpc-epochs-10-lr-5e-05 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline_en.md new file mode 100644 index 00000000000000..7c7807b7cfcf0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline pipeline BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline` is an English model originally trained by yoshitomo-matsubara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline_en_5.5.0_3.0_1727308319152.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline_en_5.5.0_3.0_1727308319152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_from_bert_large_uncased_mrpc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-mrpc_from_bert-large-uncased-mrpc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_news_ft_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_news_ft_en.md new file mode 100644 index 00000000000000..76439d6b2bb6a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_news_ft_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_news_ft BertForSequenceClassification from GTsky +author: John Snow Labs +name: bert_base_uncased_news_ft +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_news_ft` is an English model originally trained by GTsky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_news_ft_en_5.5.0_3.0_1727307345511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_news_ft_en_5.5.0_3.0_1727307345511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_news_ft","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_news_ft", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_news_ft| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/GTsky/bert-base-uncased_news_ft \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_offenseval2019_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_offenseval2019_en.md new file mode 100644 index 00000000000000..bb11ec190a502c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_offenseval2019_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_offenseval2019 BertForSequenceClassification from mohsenfayyaz +author: John Snow Labs +name: bert_base_uncased_offenseval2019 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_offenseval2019` is an English model originally trained by mohsenfayyaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_en_5.5.0_3.0_1727291810242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_en_5.5.0_3.0_1727291810242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_offenseval2019","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_offenseval2019", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_offenseval2019| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mohsenfayyaz/bert-base-uncased-offenseval2019 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_offenseval2019_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_offenseval2019_pipeline_en.md new file mode 100644 index 00000000000000..c625ef04aeeeae --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_offenseval2019_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_offenseval2019_pipeline pipeline BertForSequenceClassification from mohsenfayyaz +author: John Snow Labs +name: bert_base_uncased_offenseval2019_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_offenseval2019_pipeline` is an English model originally trained by mohsenfayyaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_pipeline_en_5.5.0_3.0_1727291831994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_pipeline_en_5.5.0_3.0_1727291831994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
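+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be an existing Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_offenseval2019_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be an existing DataFrame with a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_offenseval2019_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +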
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_offenseval2019_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mohsenfayyaz/bert-base-uncased-offenseval2019 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_optuna_finetuned_cola_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_optuna_finetuned_cola_en.md new file mode 100644 index 00000000000000..635e63305bc318 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_optuna_finetuned_cola_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_optuna_finetuned_cola BertForSequenceClassification from ga21902298 +author: John Snow Labs +name: bert_base_uncased_optuna_finetuned_cola +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_optuna_finetuned_cola` is an English model originally trained by ga21902298.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_optuna_finetuned_cola_en_5.5.0_3.0_1727290901458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_optuna_finetuned_cola_en_5.5.0_3.0_1727290901458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +# Input columns must match the upstream output columns: 'document' and 'token' +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_optuna_finetuned_cola","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needs an active SparkSession named spark, as in spark-shell + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_optuna_finetuned_cola", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_optuna_finetuned_cola| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ga21902298/bert-base-uncased-optuna-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_optuna_finetuned_cola_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_optuna_finetuned_cola_pipeline_en.md new file mode 100644 index 00000000000000..dbbfa7e05a8f98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_optuna_finetuned_cola_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_optuna_finetuned_cola_pipeline pipeline BertForSequenceClassification from ga21902298 +author: John Snow Labs +name: bert_base_uncased_optuna_finetuned_cola_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_optuna_finetuned_cola_pipeline` is an English model originally trained by ga21902298.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_optuna_finetuned_cola_pipeline_en_5.5.0_3.0_1727290927633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_optuna_finetuned_cola_pipeline_en_5.5.0_3.0_1727290927633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
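+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be an existing Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_optuna_finetuned_cola_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be an existing DataFrame with a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_optuna_finetuned_cola_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +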
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_optuna_finetuned_cola_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/ga21902298/bert-base-uncased-optuna-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qa_classification_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qa_classification_en.md new file mode 100644 index 00000000000000..01721788d0b1f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qa_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_qa_classification BertForSequenceClassification from kgourgou +author: John Snow Labs +name: bert_base_uncased_qa_classification +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qa_classification` is an English model originally trained by kgourgou.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qa_classification_en_5.5.0_3.0_1727285884269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qa_classification_en_5.5.0_3.0_1727285884269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +# Input columns must match the upstream output columns: 'document' and 'token' +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qa_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needs an active SparkSession named spark, as in spark-shell + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qa_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qa_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kgourgou/bert-base-uncased-QA-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qnli_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qnli_yoshitomo_matsubara_en.md new file mode 100644 index 00000000000000..070203175f2769 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qnli_yoshitomo_matsubara_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_qnli_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_qnli_yoshitomo_matsubara +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qnli_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qnli_yoshitomo_matsubara_en_5.5.0_3.0_1727278867498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qnli_yoshitomo_matsubara_en_5.5.0_3.0_1727278867498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +# Input columns must match the upstream output columns: 'document' and 'token' +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qnli_yoshitomo_matsubara","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needs an active SparkSession named spark, as in spark-shell + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qnli_yoshitomo_matsubara", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qnli_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qqp_epochs_2_lr_0_0001_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qqp_epochs_2_lr_0_0001_en.md new file mode 100644 index 00000000000000..ba3481252694a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_qqp_epochs_2_lr_0_0001_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_qqp_epochs_2_lr_0_0001 BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_qqp_epochs_2_lr_0_0001 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qqp_epochs_2_lr_0_0001` is an English model originally trained by prateeky2806.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_epochs_2_lr_0_0001_en_5.5.0_3.0_1727266030400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_epochs_2_lr_0_0001_en_5.5.0_3.0_1727266030400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +# Input columns must match the upstream output columns: 'document' and 'token' +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_epochs_2_lr_0_0001","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needs an active SparkSession named spark, as in spark-shell + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_epochs_2_lr_0_0001", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qqp_epochs_2_lr_0_0001| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-qqp-epochs-2-lr-0.0001 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_review1_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_review1_en.md new file mode 100644 index 00000000000000..d9a52a8ddf6450 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_review1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_review1 BertForSequenceClassification from Iresh88 +author: John Snow Labs +name: bert_base_uncased_review1 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_review1` is an English model originally trained by Iresh88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_review1_en_5.5.0_3.0_1727267657446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_review1_en_5.5.0_3.0_1727267657446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +# Input columns must match the upstream output columns: 'document' and 'token' +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_review1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needs an active SparkSession named spark, as in spark-shell + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_review1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_review1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Iresh88/bert-base-uncased-review1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_epochs_2_lr_0_0001_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_epochs_2_lr_0_0001_en.md new file mode 100644 index 00000000000000..bbb1b3eaa0409e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_epochs_2_lr_0_0001_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_sst2_epochs_2_lr_0_0001 BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_sst2_epochs_2_lr_0_0001 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_epochs_2_lr_0_0001` is an English model originally trained by prateeky2806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_epochs_2_lr_0_0001_en_5.5.0_3.0_1727306198154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_epochs_2_lr_0_0001_en_5.5.0_3.0_1727306198154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +# Input columns must match the upstream output columns: 'document' and 'token' +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_epochs_2_lr_0_0001","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needs an active SparkSession named spark, as in spark-shell + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_epochs_2_lr_0_0001", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_epochs_2_lr_0_0001| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-sst2-epochs-2-lr-0.0001 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_from_bert_large_uncased_sst2_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_from_bert_large_uncased_sst2_en.md new file mode 100644 index 00000000000000..8fc338a381991e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_from_bert_large_uncased_sst2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_sst2_from_bert_large_uncased_sst2 BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_sst2_from_bert_large_uncased_sst2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_from_bert_large_uncased_sst2` is an English model originally trained by yoshitomo-matsubara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_from_bert_large_uncased_sst2_en_5.5.0_3.0_1727307468945.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_from_bert_large_uncased_sst2_en_5.5.0_3.0_1727307468945.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +# Input columns must match the upstream output columns: 'document' and 'token' +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_from_bert_large_uncased_sst2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needs an active SparkSession named spark, as in spark-shell + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_from_bert_large_uncased_sst2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_from_bert_large_uncased_sst2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-sst2_from_bert-large-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline_en.md new file mode 100644 index 00000000000000..af6ca7d733384b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline pipeline BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline` is an English model originally trained by yoshitomo-matsubara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline_en_5.5.0_3.0_1727307490789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline_en_5.5.0_3.0_1727307490789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
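+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be an existing Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be an existing DataFrame with a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +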
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_from_bert_large_uncased_sst2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-sst2_from_bert-large-uncased-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst_pipeline_en.md new file mode 100644 index 00000000000000..0342c34364d7a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_sst_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_sst_pipeline pipeline BertForSequenceClassification from pmthangk09 +author: John Snow Labs +name: bert_base_uncased_sst_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst_pipeline` is an English model originally trained by pmthangk09. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst_pipeline_en_5.5.0_3.0_1727278165896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst_pipeline_en_5.5.0_3.0_1727278165896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
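+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be an existing Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_sst_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be an existing DataFrame with a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_sst_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +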
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/pmthangk09/bert-base-uncased-sst + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline_en.md new file mode 100644 index 00000000000000..aa6026d59f50bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline pipeline BertForTokenClassification from ali2066 +author: John Snow Labs +name: bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline_en_5.5.0_3.0_1727260609540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline_en_5.5.0_3.0_1727260609540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_token_itr0_0_0001_train_all_test_null__second_train_set_null_false_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/ali2066/bert-base-uncased_token_itr0_0.0001_TRAIN_all_TEST_null__second_train_set_NULL_False + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_top_pruned_stsb_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_top_pruned_stsb_en.md new file mode 100644 index 00000000000000..169e792641112a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_top_pruned_stsb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_top_pruned_stsb BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_top_pruned_stsb +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_top_pruned_stsb` is an English model originally trained by senfu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_stsb_en_5.5.0_3.0_1727289078804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_stsb_en_5.5.0_3.0_1727289078804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_top_pruned_stsb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_top_pruned_stsb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_top_pruned_stsb| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-top-pruned-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_top_pruned_stsb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_top_pruned_stsb_pipeline_en.md new file mode 100644 index 00000000000000..ab4e9261c13ecc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_top_pruned_stsb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_top_pruned_stsb_pipeline pipeline BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_top_pruned_stsb_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_top_pruned_stsb_pipeline` is an English model originally trained by senfu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_stsb_pipeline_en_5.5.0_3.0_1727289099975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_stsb_pipeline_en_5.5.0_3.0_1727289099975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_top_pruned_stsb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_top_pruned_stsb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_top_pruned_stsb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-top-pruned-stsb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_ver1_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_ver1_en.md new file mode 100644 index 00000000000000..735c360117f5df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_uncased_ver1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_ver1 BertForSequenceClassification from chanchongwei +author: John Snow Labs +name: bert_base_uncased_ver1 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ver1` is an English model originally trained by chanchongwei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ver1_en_5.5.0_3.0_1727290881496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ver1_en_5.5.0_3.0_1727290881496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ver1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ver1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ver1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/chanchongwei/bert-base-uncased-ver1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_base_vietnamese_vi.md b/docs/_posts/ahmedlone127/2024-09-25-bert_base_vietnamese_vi.md new file mode 100644 index 00000000000000..fc7fc792507f5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_base_vietnamese_vi.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Vietnamese bert_base_vietnamese BertForSequenceClassification from ndbao2002 +author: John Snow Labs +name: bert_base_vietnamese +date: 2024-09-25 +tags: [vi, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: vi +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_vietnamese` is a Vietnamese model originally trained by ndbao2002. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_vietnamese_vi_5.5.0_3.0_1727278718653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_vietnamese_vi_5.5.0_3.0_1727278718653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_vietnamese","vi") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_vietnamese", "vi") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_vietnamese| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|vi| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ndbao2002/bert-base-vi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_based_burmese_securityllm_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_based_burmese_securityllm_pipeline_en.md new file mode 100644 index 00000000000000..29fcb65cb57e47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_based_burmese_securityllm_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_based_burmese_securityllm_pipeline pipeline BertForSequenceClassification from GeorgeNhj +author: John Snow Labs +name: bert_based_burmese_securityllm_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_based_burmese_securityllm_pipeline` is an English model originally trained by GeorgeNhj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_based_burmese_securityllm_pipeline_en_5.5.0_3.0_1727303859982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_based_burmese_securityllm_pipeline_en_5.5.0_3.0_1727303859982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_based_burmese_securityllm_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_based_burmese_securityllm_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_based_burmese_securityllm_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/GeorgeNhj/BERT_based_My_SecurityLLM + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_based_uncased_sst2_e5_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_based_uncased_sst2_e5_en.md new file mode 100644 index 00000000000000..f9166b3949b3ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_based_uncased_sst2_e5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_based_uncased_sst2_e5 BertForSequenceClassification from EhsanAghazadeh +author: John Snow Labs +name: bert_based_uncased_sst2_e5 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_based_uncased_sst2_e5` is an English model originally trained by EhsanAghazadeh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_based_uncased_sst2_e5_en_5.5.0_3.0_1727306492905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_based_uncased_sst2_e5_en_5.5.0_3.0_1727306492905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_based_uncased_sst2_e5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_based_uncased_sst2_e5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_based_uncased_sst2_e5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/EhsanAghazadeh/bert-based-uncased-sst2-e5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_based_uncased_sst2_e5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_based_uncased_sst2_e5_pipeline_en.md new file mode 100644 index 00000000000000..923037b607a9d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_based_uncased_sst2_e5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_based_uncased_sst2_e5_pipeline pipeline BertForSequenceClassification from EhsanAghazadeh +author: John Snow Labs +name: bert_based_uncased_sst2_e5_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_based_uncased_sst2_e5_pipeline` is an English model originally trained by EhsanAghazadeh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_based_uncased_sst2_e5_pipeline_en_5.5.0_3.0_1727306514723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_based_uncased_sst2_e5_pipeline_en_5.5.0_3.0_1727306514723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_based_uncased_sst2_e5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_based_uncased_sst2_e5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_based_uncased_sst2_e5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/EhsanAghazadeh/bert-based-uncased-sst2-e5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_classification_persian_emotion_fa.md b/docs/_posts/ahmedlone127/2024-09-25-bert_classification_persian_emotion_fa.md new file mode 100644 index 00000000000000..2b4460ba2b9114 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_classification_persian_emotion_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian bert_classification_persian_emotion BertForSequenceClassification from NLPclass +author: John Snow Labs +name: bert_classification_persian_emotion +date: 2024-09-25 +tags: [fa, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classification_persian_emotion` is a Persian model originally trained by NLPclass. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classification_persian_emotion_fa_5.5.0_3.0_1727303213617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classification_persian_emotion_fa_5.5.0_3.0_1727303213617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classification_persian_emotion","fa") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classification_persian_emotion", "fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classification_persian_emotion| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/NLPclass/bert_classification_persian_emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_classification_persian_emotion_pipeline_fa.md b/docs/_posts/ahmedlone127/2024-09-25-bert_classification_persian_emotion_pipeline_fa.md new file mode 100644 index 00000000000000..263242e001b31f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_classification_persian_emotion_pipeline_fa.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Persian bert_classification_persian_emotion_pipeline pipeline BertForSequenceClassification from NLPclass +author: John Snow Labs +name: bert_classification_persian_emotion_pipeline +date: 2024-09-25 +tags: [fa, open_source, pipeline, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classification_persian_emotion_pipeline` is a Persian model originally trained by NLPclass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classification_persian_emotion_pipeline_fa_5.5.0_3.0_1727303246242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classification_persian_emotion_pipeline_fa_5.5.0_3.0_1727303246242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_classification_persian_emotion_pipeline", lang = "fa") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_classification_persian_emotion_pipeline", lang = "fa") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classification_persian_emotion_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/NLPclass/bert_classification_persian_emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_daigt_models_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_daigt_models_en.md new file mode 100644 index 00000000000000..46af07a2014a97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_daigt_models_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_daigt_models BertForSequenceClassification from zeyadusf +author: John Snow Labs +name: bert_daigt_models +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_daigt_models` is an English model originally trained by zeyadusf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_daigt_models_en_5.5.0_3.0_1727293771773.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_daigt_models_en_5.5.0_3.0_1727293771773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_daigt_models","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_daigt_models", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_daigt_models| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/zeyadusf/bert-DAIGT-MODELS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_daigt_models_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_daigt_models_pipeline_en.md new file mode 100644 index 00000000000000..832815280a3735 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_daigt_models_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_daigt_models_pipeline pipeline BertForSequenceClassification from zeyadusf +author: John Snow Labs +name: bert_daigt_models_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_daigt_models_pipeline` is an English model originally trained by zeyadusf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_daigt_models_pipeline_en_5.5.0_3.0_1727293793651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_daigt_models_pipeline_en_5.5.0_3.0_1727293793651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_daigt_models_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_daigt_models_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_daigt_models_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/zeyadusf/bert-DAIGT-MODELS + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_emotion_en.md new file mode 100644 index 00000000000000..8311ef3cc0001c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_emotion_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English bert_finetuned_emotion BertForSequenceClassification from PascalY +author: John Snow Labs +name: bert_finetuned_emotion +date: 2024-09-25 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_emotion` is an English model originally trained by PascalY. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_emotion_en_5.5.0_3.0_1727291628041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_emotion_en_5.5.0_3.0_1727291628041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_emotion| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +References + +https://huggingface.co/PascalY/bert-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_emotion_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_emotion_pipeline_en.md new file mode 100644 index 00000000000000..50392986843d02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_emotion_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_emotion_pipeline pipeline BertForSequenceClassification from IsmaelMousa +author: John Snow Labs +name: bert_finetuned_emotion_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_emotion_pipeline` is an English model originally trained by IsmaelMousa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_emotion_pipeline_en_5.5.0_3.0_1727291659349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_emotion_pipeline_en_5.5.0_3.0_1727291659349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_emotion_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_emotion_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_emotion_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/IsmaelMousa/bert-finetuned-emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_news_classifier_portuguese_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_news_classifier_portuguese_en.md new file mode 100644 index 00000000000000..09f50c804f7ca7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_news_classifier_portuguese_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_news_classifier_portuguese BertForSequenceClassification from ClaudianoLeonardo +author: John Snow Labs +name: bert_finetuned_news_classifier_portuguese +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_news_classifier_portuguese` is an English model originally trained by ClaudianoLeonardo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_news_classifier_portuguese_en_5.5.0_3.0_1727291625352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_news_classifier_portuguese_en_5.5.0_3.0_1727291625352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_news_classifier_portuguese","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_news_classifier_portuguese", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_news_classifier_portuguese| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ClaudianoLeonardo/bert-finetuned_news_classifier-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_news_classifier_portuguese_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_news_classifier_portuguese_pipeline_en.md new file mode 100644 index 00000000000000..4ba75b550f060e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_news_classifier_portuguese_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_news_classifier_portuguese_pipeline pipeline BertForSequenceClassification from ClaudianoLeonardo +author: John Snow Labs +name: bert_finetuned_news_classifier_portuguese_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_news_classifier_portuguese_pipeline` is an English model originally trained by ClaudianoLeonardo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_news_classifier_portuguese_pipeline_en_5.5.0_3.0_1727291657462.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_news_classifier_portuguese_pipeline_en_5.5.0_3.0_1727291657462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_news_classifier_portuguese_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_news_classifier_portuguese_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_news_classifier_portuguese_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ClaudianoLeonardo/bert-finetuned_news_classifier-portuguese + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_semitic_languages_eval_english_sarasarasara_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_semitic_languages_eval_english_sarasarasara_en.md new file mode 100644 index 00000000000000..358f6e41b43994 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_semitic_languages_eval_english_sarasarasara_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_semitic_languages_eval_english_sarasarasara BertForSequenceClassification from sarasarasara +author: John Snow Labs +name: bert_finetuned_semitic_languages_eval_english_sarasarasara +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_semitic_languages_eval_english_sarasarasara` is an English model originally trained by sarasarasara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_sarasarasara_en_5.5.0_3.0_1727308353349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_sarasarasara_en_5.5.0_3.0_1727308353349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_sarasarasara","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_sarasarasara", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_semitic_languages_eval_english_sarasarasara| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sarasarasara/bert-finetuned-sem_eval-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline_en.md new file mode 100644 index 00000000000000..e472eb99e2d0e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline pipeline BertForSequenceClassification from sarasarasara +author: John Snow Labs +name: bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline` is an English model originally trained by sarasarasara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline_en_5.5.0_3.0_1727308375040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline_en_5.5.0_3.0_1727308375040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_semitic_languages_eval_english_sarasarasara_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sarasarasara/bert-finetuned-sem_eval-english + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_weibo_luobokuaipao_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_weibo_luobokuaipao_en.md new file mode 100644 index 00000000000000..298af023ef8359 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_weibo_luobokuaipao_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_weibo_luobokuaipao BertForSequenceClassification from wsqstar +author: John Snow Labs +name: bert_finetuned_weibo_luobokuaipao +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_weibo_luobokuaipao` is an English model originally trained by wsqstar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_weibo_luobokuaipao_en_5.5.0_3.0_1727298790588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_weibo_luobokuaipao_en_5.5.0_3.0_1727298790588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_weibo_luobokuaipao","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_weibo_luobokuaipao", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_weibo_luobokuaipao| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/wsqstar/bert-finetuned-weibo-luobokuaipao \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_weibo_luobokuaipao_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_weibo_luobokuaipao_pipeline_en.md new file mode 100644 index 00000000000000..c7d0832277f1f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuned_weibo_luobokuaipao_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_weibo_luobokuaipao_pipeline pipeline BertForSequenceClassification from wsqstar +author: John Snow Labs +name: bert_finetuned_weibo_luobokuaipao_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_weibo_luobokuaipao_pipeline` is an English model originally trained by wsqstar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_weibo_luobokuaipao_pipeline_en_5.5.0_3.0_1727298811142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_weibo_luobokuaipao_pipeline_en_5.5.0_3.0_1727298811142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_weibo_luobokuaipao_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_weibo_luobokuaipao_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_weibo_luobokuaipao_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/wsqstar/bert-finetuned-weibo-luobokuaipao + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuning_test_qingtan007_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuning_test_qingtan007_en.md new file mode 100644 index 00000000000000..0d39b0011349fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuning_test_qingtan007_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuning_test_qingtan007 BertForSequenceClassification from qingtan007 +author: John Snow Labs +name: bert_finetuning_test_qingtan007 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_qingtan007` is an English model originally trained by qingtan007. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_qingtan007_en_5.5.0_3.0_1727305913121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_qingtan007_en_5.5.0_3.0_1727305913121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_qingtan007","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_qingtan007", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_qingtan007| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/qingtan007/bert_finetuning_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finetuning_test_qingtan007_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuning_test_qingtan007_pipeline_en.md new file mode 100644 index 00000000000000..65846510d1a6b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finetuning_test_qingtan007_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuning_test_qingtan007_pipeline pipeline BertForSequenceClassification from qingtan007 +author: John Snow Labs +name: bert_finetuning_test_qingtan007_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_qingtan007_pipeline` is an English model originally trained by qingtan007. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_qingtan007_pipeline_en_5.5.0_3.0_1727305937632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_qingtan007_pipeline_en_5.5.0_3.0_1727305937632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuning_test_qingtan007_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuning_test_qingtan007_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_qingtan007_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/qingtan007/bert_finetuning_test + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_finnish_sentiment_analysis_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_finnish_sentiment_analysis_pipeline_en.md new file mode 100644 index 00000000000000..89acc9758200af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_finnish_sentiment_analysis_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finnish_sentiment_analysis_pipeline pipeline BertForSequenceClassification from nisancoskun +author: John Snow Labs +name: bert_finnish_sentiment_analysis_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finnish_sentiment_analysis_pipeline` is an English model originally trained by nisancoskun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finnish_sentiment_analysis_pipeline_en_5.5.0_3.0_1727263548236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finnish_sentiment_analysis_pipeline_en_5.5.0_3.0_1727263548236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finnish_sentiment_analysis_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finnish_sentiment_analysis_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finnish_sentiment_analysis_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|466.9 MB| + +## References + +https://huggingface.co/nisancoskun/bert-finnish-sentiment-analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_hatespeechrecognition_german_de.md b/docs/_posts/ahmedlone127/2024-09-25-bert_hatespeechrecognition_german_de.md new file mode 100644 index 00000000000000..f0b5a7fdc702dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_hatespeechrecognition_german_de.md @@ -0,0 +1,94 @@ +--- +layout: model +title: German bert_hatespeechrecognition_german BertForSequenceClassification from jorgeortizv +author: John Snow Labs +name: bert_hatespeechrecognition_german +date: 2024-09-25 +tags: [de, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: de +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_hatespeechrecognition_german` is a German model originally trained by jorgeortizv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_hatespeechrecognition_german_de_5.5.0_3.0_1727305141301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_hatespeechrecognition_german_de_5.5.0_3.0_1727305141301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_hatespeechrecognition_german","de") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_hatespeechrecognition_german", "de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_hatespeechrecognition_german| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| + +## References + +https://huggingface.co/jorgeortizv/BERT-hateSpeechRecognition-German \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_book_genre_classification_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_book_genre_classification_en.md new file mode 100644 index 00000000000000..ac79ea057933f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_book_genre_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_book_genre_classification BertForSequenceClassification from TenzinGayche +author: John Snow Labs +name: bert_large_book_genre_classification +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_book_genre_classification` is an English model originally trained by TenzinGayche. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_book_genre_classification_en_5.5.0_3.0_1727306575800.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_book_genre_classification_en_5.5.0_3.0_1727306575800.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_book_genre_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_book_genre_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_book_genre_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/TenzinGayche/Bert-large-book-genre-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_cased_fever_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_cased_fever_en.md new file mode 100644 index 00000000000000..9256d681d69592 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_cased_fever_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_cased_fever BertForSequenceClassification from sagnikrayc +author: John Snow Labs +name: bert_large_cased_fever +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_fever` is an English model originally trained by sagnikrayc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_fever_en_5.5.0_3.0_1727286936866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_fever_en_5.5.0_3.0_1727286936866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_cased_fever","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_cased_fever", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_fever| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/sagnikrayc/bert-large-cased-fever \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_cased_fever_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_cased_fever_pipeline_en.md new file mode 100644 index 00000000000000..b3311a38723bbd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_cased_fever_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_cased_fever_pipeline pipeline BertForSequenceClassification from sagnikrayc +author: John Snow Labs +name: bert_large_cased_fever_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_fever_pipeline` is an English model originally trained by sagnikrayc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_fever_pipeline_en_5.5.0_3.0_1727287003215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_fever_pipeline_en_5.5.0_3.0_1727287003215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_cased_fever_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_cased_fever_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_fever_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/sagnikrayc/bert-large-cased-fever + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_portuguese_archive_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_portuguese_archive_en.md new file mode 100644 index 00000000000000..669b8a6f9fbb4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_portuguese_archive_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_portuguese_archive BertForTokenClassification from lfcc +author: John Snow Labs +name: bert_large_portuguese_archive +date: 2024-09-25 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_portuguese_archive` is an English model originally trained by lfcc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_archive_en_5.5.0_3.0_1727270916661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_archive_en_5.5.0_3.0_1727270916661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("bert_large_portuguese_archive","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_large_portuguese_archive", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_portuguese_archive| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/lfcc/bert-large-pt-archive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_sst2_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_sst2_en.md new file mode 100644 index 00000000000000..0018d76e0ea604 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_sst2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_sst2 BertForSequenceClassification from Cheng98 +author: John Snow Labs +name: bert_large_sst2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_sst2` is an English model originally trained by Cheng98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_sst2_en_5.5.0_3.0_1727297317648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_sst2_en_5.5.0_3.0_1727297317648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_sst2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_sst2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_sst2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Cheng98/bert-large-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_sst2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_sst2_pipeline_en.md new file mode 100644 index 00000000000000..d244c811f1d967 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_sst2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_sst2_pipeline pipeline BertForSequenceClassification from Cheng98 +author: John Snow Labs +name: bert_large_sst2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_sst2_pipeline` is an English model originally trained by Cheng98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_sst2_pipeline_en_5.5.0_3.0_1727297384450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_sst2_pipeline_en_5.5.0_3.0_1727297384450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_sst2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_sst2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_sst2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Cheng98/bert-large-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_deletion_multiclass_complete_final_v2_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_deletion_multiclass_complete_final_v2_en.md new file mode 100644 index 00000000000000..0679b58517bc5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_deletion_multiclass_complete_final_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_deletion_multiclass_complete_final_v2 BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_large_uncased_deletion_multiclass_complete_final_v2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_deletion_multiclass_complete_final_v2` is an English model originally trained by research-dump.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_deletion_multiclass_complete_final_v2_en_5.5.0_3.0_1727288141519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_deletion_multiclass_complete_final_v2_en_5.5.0_3.0_1727288141519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_deletion_multiclass_complete_final_v2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_deletion_multiclass_complete_final_v2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_deletion_multiclass_complete_final_v2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/research-dump/bert-large-uncased_deletion_multiclass_complete_final_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_finetuned_edos_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_finetuned_edos_en.md new file mode 100644 index 00000000000000..8c995945ee853a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_finetuned_edos_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_finetuned_edos BertForSequenceClassification from reinforz +author: John Snow Labs +name: bert_large_uncased_finetuned_edos +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_finetuned_edos` is an English model originally trained by reinforz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_edos_en_5.5.0_3.0_1727269327376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_edos_en_5.5.0_3.0_1727269327376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_finetuned_edos","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_finetuned_edos", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_finetuned_edos| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/reinforz/bert-large-uncased-finetuned-edos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_stsb_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_stsb_en.md new file mode 100644 index 00000000000000..196d32ce8a78d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_stsb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_stsb BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_large_uncased_stsb +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_stsb` is an English model originally trained by yoshitomo-matsubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_stsb_en_5.5.0_3.0_1727293117081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_stsb_en_5.5.0_3.0_1727293117081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_stsb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_stsb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_stsb| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-large-uncased-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_stsb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_stsb_pipeline_en.md new file mode 100644 index 00000000000000..77def2b4515700 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_large_uncased_stsb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_uncased_stsb_pipeline pipeline BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_large_uncased_stsb_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_stsb_pipeline` is an English model originally trained by yoshitomo-matsubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_stsb_pipeline_en_5.5.0_3.0_1727293179697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_stsb_pipeline_en_5.5.0_3.0_1727293179697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_uncased_stsb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_uncased_stsb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_stsb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-large-uncased-stsb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_mrpc_glue_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_mrpc_glue_en.md new file mode 100644 index 00000000000000..2c7567cc4a4991 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_mrpc_glue_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_mrpc_glue BertForSequenceClassification from itabrez +author: John Snow Labs +name: bert_mrpc_glue +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mrpc_glue` is an English model originally trained by itabrez. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mrpc_glue_en_5.5.0_3.0_1727307792599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mrpc_glue_en_5.5.0_3.0_1727307792599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_mrpc_glue","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mrpc_glue", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mrpc_glue| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/itabrez/bert-mrpc-glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_political_leaning_italian_it.md b/docs/_posts/ahmedlone127/2024-09-25-bert_political_leaning_italian_it.md new file mode 100644 index 00000000000000..edae4436239dc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_political_leaning_italian_it.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Italian bert_political_leaning_italian BertForSequenceClassification from MattiaSangermano +author: John Snow Labs +name: bert_political_leaning_italian +date: 2024-09-25 +tags: [it, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: it +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_political_leaning_italian` is an Italian model originally trained by MattiaSangermano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_political_leaning_italian_it_5.5.0_3.0_1727294229559.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_political_leaning_italian_it_5.5.0_3.0_1727294229559.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_political_leaning_italian","it") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_political_leaning_italian", "it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_political_leaning_italian| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|414.8 MB| + +## References + +https://huggingface.co/MattiaSangermano/bert-political-leaning-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_political_leaning_italian_pipeline_it.md b/docs/_posts/ahmedlone127/2024-09-25-bert_political_leaning_italian_pipeline_it.md new file mode 100644 index 00000000000000..bc53653354cb57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_political_leaning_italian_pipeline_it.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Italian bert_political_leaning_italian_pipeline pipeline BertForSequenceClassification from MattiaSangermano +author: John Snow Labs +name: bert_political_leaning_italian_pipeline +date: 2024-09-25 +tags: [it, open_source, pipeline, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_political_leaning_italian_pipeline` is an Italian model originally trained by MattiaSangermano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_political_leaning_italian_pipeline_it_5.5.0_3.0_1727294252451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_political_leaning_italian_pipeline_it_5.5.0_3.0_1727294252451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_political_leaning_italian_pipeline", lang = "it") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_political_leaning_italian_pipeline", lang = "it") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_political_leaning_italian_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|it| +|Size:|414.9 MB| + +## References + +https://huggingface.co/MattiaSangermano/bert-political-leaning-it + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_pooling_based_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_pooling_based_pipeline_en.md new file mode 100644 index 00000000000000..1d0c796c9dbe07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_pooling_based_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_pooling_based_pipeline pipeline BertForSequenceClassification from elifcen +author: John Snow Labs +name: bert_pooling_based_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_pooling_based_pipeline` is an English model originally trained by elifcen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_pooling_based_pipeline_en_5.5.0_3.0_1727284856124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_pooling_based_pipeline_en_5.5.0_3.0_1727284856124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_pooling_based_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_pooling_based_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_pooling_based_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/elifcen/bert-pooling-based + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_pretrained_wikitext_2_raw_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_pretrained_wikitext_2_raw_v1_pipeline_en.md new file mode 100644 index 00000000000000..59e0540187f7cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_pretrained_wikitext_2_raw_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_pretrained_wikitext_2_raw_v1_pipeline pipeline BertEmbeddings from dimpo +author: John Snow Labs +name: bert_pretrained_wikitext_2_raw_v1_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_pretrained_wikitext_2_raw_v1_pipeline` is an English model originally trained by dimpo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_pretrained_wikitext_2_raw_v1_pipeline_en_5.5.0_3.0_1727256213343.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_pretrained_wikitext_2_raw_v1_pipeline_en_5.5.0_3.0_1727256213343.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_pretrained_wikitext_2_raw_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_pretrained_wikitext_2_raw_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_pretrained_wikitext_2_raw_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.9 MB| + +## References + +https://huggingface.co/dimpo/bert-pretrained-wikitext-2-raw-v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_protein_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_protein_classifier_pipeline_en.md new file mode 100644 index 00000000000000..923f81071d1015 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_protein_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_protein_classifier_pipeline pipeline BertForSequenceClassification from oohtmeel +author: John Snow Labs +name: bert_protein_classifier_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_protein_classifier_pipeline` is an English model originally trained by oohtmeel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_protein_classifier_pipeline_en_5.5.0_3.0_1727266466500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_protein_classifier_pipeline_en_5.5.0_3.0_1727266466500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_protein_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_protein_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_protein_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/oohtmeel/Bert_protein_classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_sanskrit_saskta_test_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_sanskrit_saskta_test_en.md new file mode 100644 index 00000000000000..355127eece55da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_sanskrit_saskta_test_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_sanskrit_saskta_test BertForSequenceClassification from ilham14 +author: John Snow Labs +name: bert_sanskrit_saskta_test +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sanskrit_saskta_test` is an English model originally trained by ilham14. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sanskrit_saskta_test_en_5.5.0_3.0_1727273584258.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sanskrit_saskta_test_en_5.5.0_3.0_1727273584258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sanskrit_saskta_test","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sanskrit_saskta_test", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sanskrit_saskta_test| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|423.1 MB| + +## References + +https://huggingface.co/ilham14/BERT-SA-Test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_small_64_finetuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_small_64_finetuned_pipeline_en.md new file mode 100644 index 00000000000000..6f69f98a99ddcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_small_64_finetuned_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_small_64_finetuned_pipeline pipeline BertForSequenceClassification from ryantaw +author: John Snow Labs +name: bert_small_64_finetuned_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_64_finetuned_pipeline` is an English model originally trained by ryantaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_64_finetuned_pipeline_en_5.5.0_3.0_1727291864586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_64_finetuned_pipeline_en_5.5.0_3.0_1727291864586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_small_64_finetuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_small_64_finetuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_64_finetuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|108.0 MB| + +## References + +https://huggingface.co/ryantaw/bert-small-64-finetuned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_sst2_padding70model_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_sst2_padding70model_en.md new file mode 100644 index 00000000000000..500105fade57d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_sst2_padding70model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_sst2_padding70model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst2_padding70model +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst2_padding70model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst2_padding70model_en_5.5.0_3.0_1727287018846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst2_padding70model_en_5.5.0_3.0_1727287018846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2_padding70model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2_padding70model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst2_padding70model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst2_padding70model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_sst2_padding90model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_sst2_padding90model_pipeline_en.md new file mode 100644 index 00000000000000..01af94bfde8706 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_sst2_padding90model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_sst2_padding90model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst2_padding90model_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst2_padding90model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst2_padding90model_pipeline_en_5.5.0_3.0_1727307641366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst2_padding90model_pipeline_en_5.5.0_3.0_1727307641366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_sst2_padding90model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_sst2_padding90model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst2_padding90model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst2_padding90model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_sst5_padding50model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_sst5_padding50model_pipeline_en.md new file mode 100644 index 00000000000000..d8a6ab5e2a652a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_sst5_padding50model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_sst5_padding50model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst5_padding50model_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst5_padding50model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst5_padding50model_pipeline_en_5.5.0_3.0_1727287970205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst5_padding50model_pipeline_en_5.5.0_3.0_1727287970205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_sst5_padding50model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_sst5_padding50model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst5_padding50model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst5_padding50model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_sst5_padding90model_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_sst5_padding90model_en.md new file mode 100644 index 00000000000000..a5be73cb8ee585 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_sst5_padding90model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_sst5_padding90model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst5_padding90model +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst5_padding90model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst5_padding90model_en_5.5.0_3.0_1727305778089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst5_padding90model_en_5.5.0_3.0_1727305778089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst5_padding90model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst5_padding90model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst5_padding90model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst5_padding90model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_subjective_amazon_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_subjective_amazon_en.md new file mode 100644 index 00000000000000..0d93a781eb1c34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_subjective_amazon_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_subjective_amazon BertForSequenceClassification from MTOrange +author: John Snow Labs +name: bert_subjective_amazon +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_subjective_amazon` is an English model originally trained by MTOrange. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_subjective_amazon_en_5.5.0_3.0_1727294738490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_subjective_amazon_en_5.5.0_3.0_1727294738490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_subjective_amazon","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_subjective_amazon", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_subjective_amazon| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MTOrange/bert-subjective-amazon \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_subjective_amazon_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_subjective_amazon_pipeline_en.md new file mode 100644 index 00000000000000..896a93ea40f74f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_subjective_amazon_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_subjective_amazon_pipeline pipeline BertForSequenceClassification from MTOrange +author: John Snow Labs +name: bert_subjective_amazon_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_subjective_amazon_pipeline` is an English model originally trained by MTOrange. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_subjective_amazon_pipeline_en_5.5.0_3.0_1727294759872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_subjective_amazon_pipeline_en_5.5.0_3.0_1727294759872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_subjective_amazon_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_subjective_amazon_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_subjective_amazon_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/MTOrange/bert-subjective-amazon + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_turkish_text_classification_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-09-25-bert_turkish_text_classification_pipeline_tr.md new file mode 100644 index 00000000000000..4c19c070711b49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_turkish_text_classification_pipeline_tr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Turkish bert_turkish_text_classification_pipeline pipeline BertForSequenceClassification from Marzu39 +author: John Snow Labs +name: bert_turkish_text_classification_pipeline +date: 2024-09-25 +tags: [tr, open_source, pipeline, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_turkish_text_classification_pipeline` is a Turkish model originally trained by Marzu39. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_turkish_text_classification_pipeline_tr_5.5.0_3.0_1727308218490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_turkish_text_classification_pipeline_tr_5.5.0_3.0_1727308218490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_turkish_text_classification_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_turkish_text_classification_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_turkish_text_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|691.2 MB| + +## References + +https://huggingface.co/Marzu39/bert-turkish-text-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_turkish_text_classification_tr.md b/docs/_posts/ahmedlone127/2024-09-25-bert_turkish_text_classification_tr.md new file mode 100644 index 00000000000000..fe719c53d19ec7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_turkish_text_classification_tr.md @@ -0,0 +1,98 @@ +--- +layout: model +title: Turkish bert_turkish_text_classification BertForSequenceClassification from akdeniz27 +author: John Snow Labs +name: bert_turkish_text_classification +date: 2024-09-25 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_turkish_text_classification` is a Turkish model originally trained by akdeniz27. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_turkish_text_classification_tr_5.5.0_3.0_1727308182196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_turkish_text_classification_tr_5.5.0_3.0_1727308182196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_turkish_text_classification","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_turkish_text_classification","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_turkish_text_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|691.2 MB| + +## References + +References + +https://huggingface.co/akdeniz27/bert-turkish-text-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bert_twitterfin_padding0model_en.md b/docs/_posts/ahmedlone127/2024-09-25-bert_twitterfin_padding0model_en.md new file mode 100644 index 00000000000000..ec7ed6b30c4c28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bert_twitterfin_padding0model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_twitterfin_padding0model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_twitterfin_padding0model +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_twitterfin_padding0model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding0model_en_5.5.0_3.0_1727285742949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding0model_en_5.5.0_3.0_1727285742949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitterfin_padding0model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitterfin_padding0model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_twitterfin_padding0model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_twitterfin_padding0model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bertdisinfdetect_en.md b/docs/_posts/ahmedlone127/2024-09-25-bertdisinfdetect_en.md new file mode 100644 index 00000000000000..32fad0f89d39db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bertdisinfdetect_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bertdisinfdetect BertForSequenceClassification from ananya122 +author: John Snow Labs +name: bertdisinfdetect +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertdisinfdetect` is an English model originally trained by ananya122. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertdisinfdetect_en_5.5.0_3.0_1727290777193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertdisinfdetect_en_5.5.0_3.0_1727290777193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bertdisinfdetect","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bertdisinfdetect", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertdisinfdetect| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ananya122/bertdisinfdetect \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bertdisinfdetect_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bertdisinfdetect_pipeline_en.md new file mode 100644 index 00000000000000..df18327a3f2eb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bertdisinfdetect_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bertdisinfdetect_pipeline pipeline BertForSequenceClassification from ananya122 +author: John Snow Labs +name: bertdisinfdetect_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertdisinfdetect_pipeline` is an English model originally trained by ananya122. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertdisinfdetect_pipeline_en_5.5.0_3.0_1727290799383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertdisinfdetect_pipeline_en_5.5.0_3.0_1727290799383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bertdisinfdetect_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bertdisinfdetect_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertdisinfdetect_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ananya122/bertdisinfdetect + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-best_model_sst_2_16_13_en.md b/docs/_posts/ahmedlone127/2024-09-25-best_model_sst_2_16_13_en.md new file mode 100644 index 00000000000000..ea1bea2955fb81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-best_model_sst_2_16_13_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English best_model_sst_2_16_13 BertForSequenceClassification from simonycl +author: John Snow Labs +name: best_model_sst_2_16_13 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `best_model_sst_2_16_13` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/best_model_sst_2_16_13_en_5.5.0_3.0_1727289303733.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/best_model_sst_2_16_13_en_5.5.0_3.0_1727289303733.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("best_model_sst_2_16_13","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("best_model_sst_2_16_13", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|best_model_sst_2_16_13| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/simonycl/best_model-sst-2-16-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-best_model_sst_2_16_13_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-best_model_sst_2_16_13_pipeline_en.md new file mode 100644 index 00000000000000..d2c3f0696c29f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-best_model_sst_2_16_13_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English best_model_sst_2_16_13_pipeline pipeline BertForSequenceClassification from simonycl +author: John Snow Labs +name: best_model_sst_2_16_13_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `best_model_sst_2_16_13_pipeline` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/best_model_sst_2_16_13_pipeline_en_5.5.0_3.0_1727289334926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/best_model_sst_2_16_13_pipeline_en_5.5.0_3.0_1727289334926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("best_model_sst_2_16_13_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("best_model_sst_2_16_13_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|best_model_sst_2_16_13_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/simonycl/best_model-sst-2-16-13 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-beto_sentiment_analysis_finetuned_onpremise_en.md b/docs/_posts/ahmedlone127/2024-09-25-beto_sentiment_analysis_finetuned_onpremise_en.md new file mode 100644 index 00000000000000..af8f0d580d4e0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-beto_sentiment_analysis_finetuned_onpremise_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English beto_sentiment_analysis_finetuned_onpremise BertForSequenceClassification from Cristian-dcg +author: John Snow Labs +name: beto_sentiment_analysis_finetuned_onpremise +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_sentiment_analysis_finetuned_onpremise` is an English model originally trained by Cristian-dcg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_sentiment_analysis_finetuned_onpremise_en_5.5.0_3.0_1727263924878.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_sentiment_analysis_finetuned_onpremise_en_5.5.0_3.0_1727263924878.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("beto_sentiment_analysis_finetuned_onpremise","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("beto_sentiment_analysis_finetuned_onpremise", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_sentiment_analysis_finetuned_onpremise| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/Cristian-dcg/beto-sentiment-analysis-finetuned-onpremise \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-biomednlp_pubmedbert_proteinstructure_ner_v1_2_en.md b/docs/_posts/ahmedlone127/2024-09-25-biomednlp_pubmedbert_proteinstructure_ner_v1_2_en.md new file mode 100644 index 00000000000000..ddde74c0c9bf6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-biomednlp_pubmedbert_proteinstructure_ner_v1_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English biomednlp_pubmedbert_proteinstructure_ner_v1_2 BertForTokenClassification from PDBEurope +author: John Snow Labs +name: biomednlp_pubmedbert_proteinstructure_ner_v1_2 +date: 2024-09-25 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomednlp_pubmedbert_proteinstructure_ner_v1_2` is an English model originally trained by PDBEurope. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_proteinstructure_ner_v1_2_en_5.5.0_3.0_1727275537296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_proteinstructure_ner_v1_2_en_5.5.0_3.0_1727275537296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("biomednlp_pubmedbert_proteinstructure_ner_v1_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("biomednlp_pubmedbert_proteinstructure_ner_v1_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomednlp_pubmedbert_proteinstructure_ner_v1_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/PDBEurope/BiomedNLP-PubMedBERT-ProteinStructure-NER-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-boss_sentiment_1500_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-boss_sentiment_1500_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..537c141afbeaa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-boss_sentiment_1500_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English boss_sentiment_1500_bert_base_uncased_pipeline pipeline BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_sentiment_1500_bert_base_uncased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_sentiment_1500_bert_base_uncased_pipeline` is an English model originally trained by Kyle1668. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_sentiment_1500_bert_base_uncased_pipeline_en_5.5.0_3.0_1727267529999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_sentiment_1500_bert_base_uncased_pipeline_en_5.5.0_3.0_1727267529999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("boss_sentiment_1500_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("boss_sentiment_1500_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_sentiment_1500_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-sentiment-1500-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-boss_toxicity_48000_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-25-boss_toxicity_48000_bert_base_uncased_en.md new file mode 100644 index 00000000000000..a10308bd8dea65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-boss_toxicity_48000_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English boss_toxicity_48000_bert_base_uncased BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_toxicity_48000_bert_base_uncased +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_toxicity_48000_bert_base_uncased` is an English model originally trained by Kyle1668. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_toxicity_48000_bert_base_uncased_en_5.5.0_3.0_1727301981520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_toxicity_48000_bert_base_uncased_en_5.5.0_3.0_1727301981520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("boss_toxicity_48000_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("boss_toxicity_48000_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_toxicity_48000_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-toxicity-48000-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-boss_toxicity_48000_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-boss_toxicity_48000_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..3555ef066b9ac4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-boss_toxicity_48000_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English boss_toxicity_48000_bert_base_uncased_pipeline pipeline BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_toxicity_48000_bert_base_uncased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_toxicity_48000_bert_base_uncased_pipeline` is an English model originally trained by Kyle1668.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_toxicity_48000_bert_base_uncased_pipeline_en_5.5.0_3.0_1727302003503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_toxicity_48000_bert_base_uncased_pipeline_en_5.5.0_3.0_1727302003503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("boss_toxicity_48000_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("boss_toxicity_48000_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_toxicity_48000_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-toxicity-48000-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-bp_int_i04_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-bp_int_i04_pipeline_en.md new file mode 100644 index 00000000000000..6d494a8f959b29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-bp_int_i04_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bp_int_i04_pipeline pipeline BertForSequenceClassification from Anwaarma +author: John Snow Labs +name: bp_int_i04_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bp_int_i04_pipeline` is an English model originally trained by Anwaarma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bp_int_i04_pipeline_en_5.5.0_3.0_1727268839280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bp_int_i04_pipeline_en_5.5.0_3.0_1727268839280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bp_int_i04_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bp_int_i04_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bp_int_i04_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/Anwaarma/BP-INT-I04 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-burmese_awesome_model_mishig_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-burmese_awesome_model_mishig_pipeline_en.md new file mode 100644 index 00000000000000..27cf19906a3403 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-burmese_awesome_model_mishig_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_model_mishig_pipeline pipeline BertForSequenceClassification from mishig +author: John Snow Labs +name: burmese_awesome_model_mishig_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_mishig_pipeline` is an English model originally trained by mishig. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_mishig_pipeline_en_5.5.0_3.0_1727305828308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_mishig_pipeline_en_5.5.0_3.0_1727305828308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_model_mishig_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_model_mishig_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_mishig_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/mishig/my-awesome-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-burmese_awesome_pubmed_bert_en.md b/docs/_posts/ahmedlone127/2024-09-25-burmese_awesome_pubmed_bert_en.md new file mode 100644 index 00000000000000..c1e3fed632bc60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-burmese_awesome_pubmed_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English burmese_awesome_pubmed_bert BertForTokenClassification from arunavsk1 +author: John Snow Labs +name: burmese_awesome_pubmed_bert +date: 2024-09-25 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_pubmed_bert` is an English model originally trained by arunavsk1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_pubmed_bert_en_5.5.0_3.0_1727247446957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_pubmed_bert_en_5.5.0_3.0_1727247446957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("burmese_awesome_pubmed_bert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("burmese_awesome_pubmed_bert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_pubmed_bert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/arunavsk1/my-awesome-pubmed-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-chinese_roberta_climate_related_prediction_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-chinese_roberta_climate_related_prediction_v2_pipeline_en.md new file mode 100644 index 00000000000000..c3c1d4f00e0108 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-chinese_roberta_climate_related_prediction_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English chinese_roberta_climate_related_prediction_v2_pipeline pipeline BertForSequenceClassification from hw2942 +author: John Snow Labs +name: chinese_roberta_climate_related_prediction_v2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_roberta_climate_related_prediction_v2_pipeline` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_climate_related_prediction_v2_pipeline_en_5.5.0_3.0_1727306209527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_climate_related_prediction_v2_pipeline_en_5.5.0_3.0_1727306209527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("chinese_roberta_climate_related_prediction_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("chinese_roberta_climate_related_prediction_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_climate_related_prediction_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.2 MB| + +## References + +https://huggingface.co/hw2942/chinese-roberta-climate-related-prediction-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-cola_6ep_ft_47_en.md b/docs/_posts/ahmedlone127/2024-09-25-cola_6ep_ft_47_en.md new file mode 100644 index 00000000000000..68985396c97bdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-cola_6ep_ft_47_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cola_6ep_ft_47 BertForSequenceClassification from connectivity +author: John Snow Labs +name: cola_6ep_ft_47 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cola_6ep_ft_47` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cola_6ep_ft_47_en_5.5.0_3.0_1727289461341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cola_6ep_ft_47_en_5.5.0_3.0_1727289461341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cola_6ep_ft_47","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cola_6ep_ft_47", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cola_6ep_ft_47| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/cola_6ep_ft-47 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-cola_6ep_ft_47_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-cola_6ep_ft_47_pipeline_en.md new file mode 100644 index 00000000000000..bfe9c2649887a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-cola_6ep_ft_47_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cola_6ep_ft_47_pipeline pipeline BertForSequenceClassification from connectivity +author: John Snow Labs +name: cola_6ep_ft_47_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cola_6ep_ft_47_pipeline` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cola_6ep_ft_47_pipeline_en_5.5.0_3.0_1727289482735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cola_6ep_ft_47_pipeline_en_5.5.0_3.0_1727289482735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cola_6ep_ft_47_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cola_6ep_ft_47_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cola_6ep_ft_47_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/cola_6ep_ft-47 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr14_seed0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr14_seed0_pipeline_en.md new file mode 100644 index 00000000000000..ad1691ae8319ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr14_seed0_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr14_seed0_pipeline pipeline BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr14_seed0_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr14_seed0_pipeline` is an English model originally trained by ibm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr14_seed0_pipeline_en_5.5.0_3.0_1727289634015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr14_seed0_pipeline_en_5.5.0_3.0_1727289634015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cold_fusion_bert_base_uncased_itr14_seed0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cold_fusion_bert_base_uncased_itr14_seed0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr14_seed0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr14-seed0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr20_seed0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr20_seed0_pipeline_en.md new file mode 100644 index 00000000000000..97bef45a6e9d16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr20_seed0_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr20_seed0_pipeline pipeline BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr20_seed0_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr20_seed0_pipeline` is an English model originally trained by ibm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr20_seed0_pipeline_en_5.5.0_3.0_1727289833005.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr20_seed0_pipeline_en_5.5.0_3.0_1727289833005.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cold_fusion_bert_base_uncased_itr20_seed0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cold_fusion_bert_base_uncased_itr20_seed0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr20_seed0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr20-seed0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_en.md b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_en.md new file mode 100644 index 00000000000000..dc33b92e614201 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl BertForSequenceClassification from jakub014 +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl` is an English model originally trained by jakub014.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_en_5.5.0_3.0_1727291286031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_en_5.5.0_3.0_1727291286031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jakub014/ColD-Fusion-bert-base-uncased-itr23-seed0-finetuned-effectiveness-dagstuhl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline_en.md new file mode 100644 index 00000000000000..6b986e8aeac080 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline pipeline BertForSequenceClassification from jakub014 +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline` is an English model originally trained by jakub014.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline_en_5.5.0_3.0_1727291307499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline_en_5.5.0_3.0_1727291307499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr23_seed0_finetuned_effectiveness_dagstuhl_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jakub014/ColD-Fusion-bert-base-uncased-itr23-seed0-finetuned-effectiveness-dagstuhl + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-conflibert_satp_relevant_multilabel_en.md b/docs/_posts/ahmedlone127/2024-09-25-conflibert_satp_relevant_multilabel_en.md new file mode 100644 index 00000000000000..cf96bc237923cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-conflibert_satp_relevant_multilabel_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English conflibert_satp_relevant_multilabel BertForSequenceClassification from eventdata-utd +author: John Snow Labs +name: conflibert_satp_relevant_multilabel +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `conflibert_satp_relevant_multilabel` is an English model originally trained by eventdata-utd.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/conflibert_satp_relevant_multilabel_en_5.5.0_3.0_1727303279023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/conflibert_satp_relevant_multilabel_en_5.5.0_3.0_1727303279023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("conflibert_satp_relevant_multilabel","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("conflibert_satp_relevant_multilabel", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|conflibert_satp_relevant_multilabel| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/eventdata-utd/conflibert-satp-relevant-multilabel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-core_clinical_mortality_prediction_en.md b/docs/_posts/ahmedlone127/2024-09-25-core_clinical_mortality_prediction_en.md new file mode 100644 index 00000000000000..1dc323dcdcf5af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-core_clinical_mortality_prediction_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English core_clinical_mortality_prediction BertForSequenceClassification from DATEXIS +author: John Snow Labs +name: core_clinical_mortality_prediction +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `core_clinical_mortality_prediction` is an English model originally trained by DATEXIS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/core_clinical_mortality_prediction_en_5.5.0_3.0_1727307100245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/core_clinical_mortality_prediction_en_5.5.0_3.0_1727307100245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("core_clinical_mortality_prediction","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("core_clinical_mortality_prediction", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|core_clinical_mortality_prediction| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.3 MB| + +## References + +https://huggingface.co/DATEXIS/CORe-clinical-mortality-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-covid_vaccine_sentiment_analysis_bert_based_model_en.md b/docs/_posts/ahmedlone127/2024-09-25-covid_vaccine_sentiment_analysis_bert_based_model_en.md new file mode 100644 index 00000000000000..742378c166d6f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-covid_vaccine_sentiment_analysis_bert_based_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English covid_vaccine_sentiment_analysis_bert_based_model BertForSequenceClassification from NewtonKimathi +author: John Snow Labs +name: covid_vaccine_sentiment_analysis_bert_based_model +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_vaccine_sentiment_analysis_bert_based_model` is an English model originally trained by NewtonKimathi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_vaccine_sentiment_analysis_bert_based_model_en_5.5.0_3.0_1727295041134.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_vaccine_sentiment_analysis_bert_based_model_en_5.5.0_3.0_1727295041134.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("covid_vaccine_sentiment_analysis_bert_based_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("covid_vaccine_sentiment_analysis_bert_based_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_vaccine_sentiment_analysis_bert_based_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/NewtonKimathi/Covid_Vaccine_Sentiment_Analysis_Bert_based_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-covid_vaccine_sentiment_analysis_bert_based_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-covid_vaccine_sentiment_analysis_bert_based_model_pipeline_en.md new file mode 100644 index 00000000000000..2740e17763f78b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-covid_vaccine_sentiment_analysis_bert_based_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English covid_vaccine_sentiment_analysis_bert_based_model_pipeline pipeline BertForSequenceClassification from NewtonKimathi +author: John Snow Labs +name: covid_vaccine_sentiment_analysis_bert_based_model_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_vaccine_sentiment_analysis_bert_based_model_pipeline` is an English model originally trained by NewtonKimathi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_vaccine_sentiment_analysis_bert_based_model_pipeline_en_5.5.0_3.0_1727295063175.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_vaccine_sentiment_analysis_bert_based_model_pipeline_en_5.5.0_3.0_1727295063175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
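+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("covid_vaccine_sentiment_analysis_bert_based_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val df = Seq("I love spark-nlp").toDS.toDF("text") // assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("covid_vaccine_sentiment_analysis_bert_based_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +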
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_vaccine_sentiment_analysis_bert_based_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/NewtonKimathi/Covid_Vaccine_Sentiment_Analysis_Bert_based_Model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-cross_encoder_roberta_wwm_ext_v0_pipeline_zh.md b/docs/_posts/ahmedlone127/2024-09-25-cross_encoder_roberta_wwm_ext_v0_pipeline_zh.md new file mode 100644 index 00000000000000..50944a2c380af0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-cross_encoder_roberta_wwm_ext_v0_pipeline_zh.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Chinese cross_encoder_roberta_wwm_ext_v0_pipeline pipeline BertForSequenceClassification from tuhailong +author: John Snow Labs +name: cross_encoder_roberta_wwm_ext_v0_pipeline +date: 2024-09-25 +tags: [zh, open_source, pipeline, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_roberta_wwm_ext_v0_pipeline` is a Chinese model originally trained by tuhailong.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_roberta_wwm_ext_v0_pipeline_zh_5.5.0_3.0_1727293200144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_roberta_wwm_ext_v0_pipeline_zh_5.5.0_3.0_1727293200144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
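+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("cross_encoder_roberta_wwm_ext_v0_pipeline", lang = "zh") +annotations = pipeline.transform(df) + +``` +```scala + +val df = Seq("I love spark-nlp").toDS.toDF("text") // assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("cross_encoder_roberta_wwm_ext_v0_pipeline", lang = "zh") +val annotations = pipeline.transform(df) + +``` +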
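The archive names in the Download links of these cards appear to follow the pattern `<name>_<lang>_<Spark NLP version>_<Spark version>_<timestamp>.zip`. That layout is inferred from the URLs and front matter in this diff, not from a documented contract, and the trailing number is assumed to be a Unix timestamp in milliseconds. Under those assumptions, a minimal sketch for splitting such a name into its parts could look like this (`parse_archive_name` is a hypothetical helper, not part of Spark NLP):

```python
import re
from datetime import datetime, timezone

# Assumed layout: <name>_<lang>_<Spark NLP version>_<Spark version>_<epoch millis>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_archive_name(filename):
    """Split a model archive file name into its assumed components."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename}")
    # The timestamp is assumed to be milliseconds since the Unix epoch (UTC).
    uploaded = datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc)
    return {
        "name": match["name"],
        "language": match["lang"],
        "spark_nlp_version": match["nlp"],
        "spark_version": match["spark"],
        "uploaded": uploaded,
    }


info = parse_archive_name(
    "cross_encoder_roberta_wwm_ext_v0_pipeline_zh_5.5.0_3.0_1727293200144.zip"
)
print(info["name"], info["language"], info["uploaded"].date())
```

The parsed fields should line up with the `name`, `language`, `edition`, `spark_version` and `date` entries in the card's front matter, which gives a quick consistency check when reviewing a batch of generated cards.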
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_roberta_wwm_ext_v0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|zh| +|Size:|383.2 MB| + +## References + +https://huggingface.co/tuhailong/cross_encoder_roberta-wwm-ext_v0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-darija_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2024-09-25-darija_sentiment_analysis_en.md new file mode 100644 index 00000000000000..88c98c80b8e609 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-darija_sentiment_analysis_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English darija_sentiment_analysis BertForSequenceClassification from ychafiqui +author: John Snow Labs +name: darija_sentiment_analysis +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `darija_sentiment_analysis` is an English model originally trained by ychafiqui. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/darija_sentiment_analysis_en_5.5.0_3.0_1727301403704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/darija_sentiment_analysis_en_5.5.0_3.0_1727301403704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("darija_sentiment_analysis","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("darija_sentiment_analysis", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|darija_sentiment_analysis| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|553.7 MB| + +## References + +https://huggingface.co/ychafiqui/darija_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-darija_sentiment_analysis_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-darija_sentiment_analysis_pipeline_en.md new file mode 100644 index 00000000000000..db7e341ad53e4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-darija_sentiment_analysis_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English darija_sentiment_analysis_pipeline pipeline BertForSequenceClassification from ychafiqui +author: John Snow Labs +name: darija_sentiment_analysis_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `darija_sentiment_analysis_pipeline` is an English model originally trained by ychafiqui. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/darija_sentiment_analysis_pipeline_en_5.5.0_3.0_1727301433561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/darija_sentiment_analysis_pipeline_en_5.5.0_3.0_1727301433561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
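+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("darija_sentiment_analysis_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val df = Seq("I love spark-nlp").toDS.toDF("text") // assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("darija_sentiment_analysis_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +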
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|darija_sentiment_analysis_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|553.7 MB| + +## References + +https://huggingface.co/ychafiqui/darija_sentiment_analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-dbmdz_bertturk_sentiment_groundtruthv5_en.md b/docs/_posts/ahmedlone127/2024-09-25-dbmdz_bertturk_sentiment_groundtruthv5_en.md new file mode 100644 index 00000000000000..00e6d0aeab1810 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-dbmdz_bertturk_sentiment_groundtruthv5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English dbmdz_bertturk_sentiment_groundtruthv5 BertForSequenceClassification from Apoksk1 +author: John Snow Labs +name: dbmdz_bertturk_sentiment_groundtruthv5 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbmdz_bertturk_sentiment_groundtruthv5` is an English model originally trained by Apoksk1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbmdz_bertturk_sentiment_groundtruthv5_en_5.5.0_3.0_1727292847834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbmdz_bertturk_sentiment_groundtruthv5_en_5.5.0_3.0_1727292847834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("dbmdz_bertturk_sentiment_groundtruthv5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dbmdz_bertturk_sentiment_groundtruthv5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbmdz_bertturk_sentiment_groundtruthv5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/Apoksk1/dbmdz-bertTurk-sentiment-GroundTruthV5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-dbmdz_bertturk_sentiment_groundtruthv5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-dbmdz_bertturk_sentiment_groundtruthv5_pipeline_en.md new file mode 100644 index 00000000000000..25549ebf74d28b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-dbmdz_bertturk_sentiment_groundtruthv5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English dbmdz_bertturk_sentiment_groundtruthv5_pipeline pipeline BertForSequenceClassification from Apoksk1 +author: John Snow Labs +name: dbmdz_bertturk_sentiment_groundtruthv5_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbmdz_bertturk_sentiment_groundtruthv5_pipeline` is an English model originally trained by Apoksk1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbmdz_bertturk_sentiment_groundtruthv5_pipeline_en_5.5.0_3.0_1727292869366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbmdz_bertturk_sentiment_groundtruthv5_pipeline_en_5.5.0_3.0_1727292869366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
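+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("dbmdz_bertturk_sentiment_groundtruthv5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val df = Seq("I love spark-nlp").toDS.toDF("text") // assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("dbmdz_bertturk_sentiment_groundtruthv5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +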
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbmdz_bertturk_sentiment_groundtruthv5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/Apoksk1/dbmdz-bertTurk-sentiment-GroundTruthV5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-dbpedia_classes_bert_base_uncased_few_10_f_en.md b/docs/_posts/ahmedlone127/2024-09-25-dbpedia_classes_bert_base_uncased_few_10_f_en.md new file mode 100644 index 00000000000000..82512a432c3e22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-dbpedia_classes_bert_base_uncased_few_10_f_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English dbpedia_classes_bert_base_uncased_few_10_f BertForSequenceClassification from TheChickenAgent +author: John Snow Labs +name: dbpedia_classes_bert_base_uncased_few_10_f +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbpedia_classes_bert_base_uncased_few_10_f` is an English model originally trained by TheChickenAgent.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_10_f_en_5.5.0_3.0_1727288691206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_10_f_en_5.5.0_3.0_1727288691206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_10_f","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_10_f", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbpedia_classes_bert_base_uncased_few_10_f| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/TheChickenAgent/DBPedia_Classes_BERT-base-uncased-few-10-F \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-detoxify_toxic_english_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-detoxify_toxic_english_pipeline_en.md new file mode 100644 index 00000000000000..549778dd4ba6c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-detoxify_toxic_english_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English detoxify_toxic_english_pipeline pipeline BertForSequenceClassification from joshuahm +author: John Snow Labs +name: detoxify_toxic_english_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `detoxify_toxic_english_pipeline` is an English model originally trained by joshuahm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/detoxify_toxic_english_pipeline_en_5.5.0_3.0_1727290854010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/detoxify_toxic_english_pipeline_en_5.5.0_3.0_1727290854010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
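+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("detoxify_toxic_english_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val df = Seq("I love spark-nlp").toDS.toDF("text") // assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("detoxify_toxic_english_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +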
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|detoxify_toxic_english_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/joshuahm/detoxify_toxic_en + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-dialogue2_en.md b/docs/_posts/ahmedlone127/2024-09-25-dialogue2_en.md new file mode 100644 index 00000000000000..e633c2365e6678 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-dialogue2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English dialogue2 BertForSequenceClassification from SharonTudi +author: John Snow Labs +name: dialogue2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dialogue2` is an English model originally trained by SharonTudi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dialogue2_en_5.5.0_3.0_1727289831643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dialogue2_en_5.5.0_3.0_1727289831643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("dialogue2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dialogue2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dialogue2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/SharonTudi/DIALOGUE2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-dialogue_final_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-dialogue_final_model_pipeline_en.md new file mode 100644 index 00000000000000..04fca35cd7888d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-dialogue_final_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English dialogue_final_model_pipeline pipeline BertForSequenceClassification from SharonTudi +author: John Snow Labs +name: dialogue_final_model_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dialogue_final_model_pipeline` is an English model originally trained by SharonTudi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dialogue_final_model_pipeline_en_5.5.0_3.0_1727288776230.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dialogue_final_model_pipeline_en_5.5.0_3.0_1727288776230.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("dialogue_final_model_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("dialogue_final_model_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dialogue_final_model_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/SharonTudi/DIALOGUE_final_model
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-discord_crypto_scam_detector_en.md b/docs/_posts/ahmedlone127/2024-09-25-discord_crypto_scam_detector_en.md
new file mode 100644
index 00000000000000..d3b57f2000e95f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-discord_crypto_scam_detector_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English discord_crypto_scam_detector BertForSequenceClassification from rzeydelis
+author: John Snow Labs
+name: discord_crypto_scam_detector
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `discord_crypto_scam_detector` is an English model originally trained by rzeydelis.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/discord_crypto_scam_detector_en_5.5.0_3.0_1727291040035.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/discord_crypto_scam_detector_en_5.5.0_3.0_1727291040035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("discord_crypto_scam_detector","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("discord_crypto_scam_detector", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|discord_crypto_scam_detector|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/rzeydelis/discord-crypto-scam-detector
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-discord_crypto_scam_detector_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-discord_crypto_scam_detector_pipeline_en.md
new file mode 100644
index 00000000000000..408d877b9d17e5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-discord_crypto_scam_detector_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English discord_crypto_scam_detector_pipeline pipeline BertForSequenceClassification from rzeydelis
+author: John Snow Labs
+name: discord_crypto_scam_detector_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `discord_crypto_scam_detector_pipeline` is an English model originally trained by rzeydelis.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/discord_crypto_scam_detector_pipeline_en_5.5.0_3.0_1727291061215.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/discord_crypto_scam_detector_pipeline_en_5.5.0_3.0_1727291061215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("discord_crypto_scam_detector_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("discord_crypto_scam_detector_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|discord_crypto_scam_detector_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/rzeydelis/discord-crypto-scam-detector
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_accelerate_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_accelerate_pipeline_en.md
new file mode 100644
index 00000000000000..babdc601a0a11b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_accelerate_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English distilbert_base_uncased_accelerate_pipeline pipeline BertForTokenClassification from NSandra
+author: John Snow Labs
+name: distilbert_base_uncased_accelerate_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_accelerate_pipeline` is an English model originally trained by NSandra.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_accelerate_pipeline_en_5.5.0_3.0_1727283642169.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_accelerate_pipeline_en_5.5.0_3.0_1727283642169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("distilbert_base_uncased_accelerate_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("distilbert_base_uncased_accelerate_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_accelerate_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/NSandra/distilbert-base-uncased-accelerate
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForTokenClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_kunalwoebot_en.md b/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_kunalwoebot_en.md
new file mode 100644
index 00000000000000..57ff90d219b40b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_kunalwoebot_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English distilbert_base_uncased_kunalwoebot BertForSequenceClassification from kunalwoebot
+author: John Snow Labs
+name: distilbert_base_uncased_kunalwoebot
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_kunalwoebot` is an English model originally trained by kunalwoebot.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_kunalwoebot_en_5.5.0_3.0_1727298672168.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_kunalwoebot_en_5.5.0_3.0_1727298672168.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_kunalwoebot","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_kunalwoebot", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_kunalwoebot|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/kunalwoebot/distilbert-base-uncased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_kunalwoebot_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_kunalwoebot_pipeline_en.md
new file mode 100644
index 00000000000000..fb085accf0264c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-distilbert_base_uncased_kunalwoebot_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English distilbert_base_uncased_kunalwoebot_pipeline pipeline BertForSequenceClassification from kunalwoebot
+author: John Snow Labs
+name: distilbert_base_uncased_kunalwoebot_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_kunalwoebot_pipeline` is an English model originally trained by kunalwoebot.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_kunalwoebot_pipeline_en_5.5.0_3.0_1727298694852.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_kunalwoebot_pipeline_en_5.5.0_3.0_1727298694852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("distilbert_base_uncased_kunalwoebot_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("distilbert_base_uncased_kunalwoebot_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_kunalwoebot_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.5 MB|
+
+## References
+
+https://huggingface.co/kunalwoebot/distilbert-base-uncased
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-dnabert_500down_en.md b/docs/_posts/ahmedlone127/2024-09-25-dnabert_500down_en.md
new file mode 100644
index 00000000000000..cc5d0ec8762ee2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-dnabert_500down_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English dnabert_500down BertForSequenceClassification from AidenH20
+author: John Snow Labs
+name: dnabert_500down
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dnabert_500down` is an English model originally trained by AidenH20.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dnabert_500down_en_5.5.0_3.0_1727300811550.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dnabert_500down_en_5.5.0_3.0_1727300811550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("dnabert_500down","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("dnabert_500down", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dnabert_500down|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|334.3 MB|
+
+## References
+
+https://huggingface.co/AidenH20/DNABERT-500down
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-dnabert_500down_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-dnabert_500down_pipeline_en.md
new file mode 100644
index 00000000000000..1d56c32dc0f636
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-dnabert_500down_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English dnabert_500down_pipeline pipeline BertForSequenceClassification from AidenH20
+author: John Snow Labs
+name: dnabert_500down_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dnabert_500down_pipeline` is an English model originally trained by AidenH20.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dnabert_500down_pipeline_en_5.5.0_3.0_1727300829744.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dnabert_500down_pipeline_en_5.5.0_3.0_1727300829744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("dnabert_500down_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("dnabert_500down_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dnabert_500down_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|334.3 MB|
+
+## References
+
+https://huggingface.co/AidenH20/DNABERT-500down
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-e2_test05_en.md b/docs/_posts/ahmedlone127/2024-09-25-e2_test05_en.md
new file mode 100644
index 00000000000000..9fe8fa3d3d8655
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-e2_test05_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English e2_test05 BertForSequenceClassification from SamagraDataGov
+author: John Snow Labs
+name: e2_test05
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `e2_test05` is an English model originally trained by SamagraDataGov.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/e2_test05_en_5.5.0_3.0_1727307861605.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/e2_test05_en_5.5.0_3.0_1727307861605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("e2_test05","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("e2_test05", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|e2_test05|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/SamagraDataGov/e2_test05
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-email_spam_classification_merged_legacy107_en.md b/docs/_posts/ahmedlone127/2024-09-25-email_spam_classification_merged_legacy107_en.md
new file mode 100644
index 00000000000000..9fa3be50d9e983
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-email_spam_classification_merged_legacy107_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English email_spam_classification_merged_legacy107 BertForSequenceClassification from legacy107
+author: John Snow Labs
+name: email_spam_classification_merged_legacy107
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `email_spam_classification_merged_legacy107` is an English model originally trained by legacy107.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/email_spam_classification_merged_legacy107_en_5.5.0_3.0_1727305892303.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/email_spam_classification_merged_legacy107_en_5.5.0_3.0_1727305892303.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("email_spam_classification_merged_legacy107","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("email_spam_classification_merged_legacy107", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|email_spam_classification_merged_legacy107|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/legacy107/email-spam-classification-merged
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-emscad_skill_extraction_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-emscad_skill_extraction_pipeline_en.md
new file mode 100644
index 00000000000000..896908e046de5e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-emscad_skill_extraction_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English emscad_skill_extraction_pipeline pipeline BertForSequenceClassification from Ivo
+author: John Snow Labs
+name: emscad_skill_extraction_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emscad_skill_extraction_pipeline` is an English model originally trained by Ivo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emscad_skill_extraction_pipeline_en_5.5.0_3.0_1727292806395.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emscad_skill_extraction_pipeline_en_5.5.0_3.0_1727292806395.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("emscad_skill_extraction_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("emscad_skill_extraction_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|emscad_skill_extraction_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Ivo/emscad-skill-extraction
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline_en.md
new file mode 100644
index 00000000000000..c4cb6c811d106d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline pipeline BertForSequenceClassification from harish
+author: John Snow Labs
+name: english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline` is an English model originally trained by harish.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline_en_5.5.0_3.0_1727292109002.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline_en_5.5.0_3.0_1727292109002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|english_astitchtask1a_bertbasecased_falsefalse_0_3_best_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/harish/EN-AStitchTask1A-BERTBaseCased-FalseFalse-0-3-BEST
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-essay_element_classifier_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-essay_element_classifier_bert_pipeline_en.md
new file mode 100644
index 00000000000000..5e0d1954f3157c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-essay_element_classifier_bert_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English essay_element_classifier_bert_pipeline pipeline BertForSequenceClassification from terminalai
+author: John Snow Labs
+name: essay_element_classifier_bert_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `essay_element_classifier_bert_pipeline` is an English model originally trained by terminalai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/essay_element_classifier_bert_pipeline_en_5.5.0_3.0_1727288091063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/essay_element_classifier_bert_pipeline_en_5.5.0_3.0_1727288091063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("essay_element_classifier_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("essay_element_classifier_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
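The archive name in the download link above appears to pack the same fields the model card lists (model name, language, Spark NLP version, Spark version) plus a trailing number that looks like an upload time in epoch milliseconds. The sketch below is a plain-Python illustration of that reading; the layout is an assumption inferred only from the links on these pages, and `parse_artifact` is a hypothetical helper, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred only from the links on these pages:
# <name>_<lang>_<spark_nlp_version>_<spark_version>_<epoch_millis>.zip
ARTIFACT_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<spark_nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_artifact(url: str) -> dict:
    """Split a model archive URL into the fields shown on the model card."""
    filename = url.rsplit("/", 1)[-1]
    match = ARTIFACT_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected artifact name: {filename}")
    fields = match.groupdict()
    # Assumption: the trailing number is a Unix timestamp in milliseconds.
    fields["uploaded"] = datetime.fromtimestamp(
        int(fields.pop("millis")) / 1000, tz=timezone.utc
    )
    return fields


info = parse_artifact(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "essay_element_classifier_bert_pipeline_en_5.5.0_3.0_1727288091063.zip"
)
print(info["name"], info["lang"], info["spark_nlp"], info["spark"], info["uploaded"].date())
```

If the assumed layout holds, the parsed fields should agree with the `name`, `language`, `edition`, `spark_version`, and `date` entries in the card's front matter, which gives a quick consistency check when many cards are generated at once.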
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|essay_element_classifier_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/terminalai/essay-element-classifier-bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-fakenews_bert_base_cased_url_en.md b/docs/_posts/ahmedlone127/2024-09-25-fakenews_bert_base_cased_url_en.md new file mode 100644 index 00000000000000..718af8d8fce439 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-fakenews_bert_base_cased_url_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fakenews_bert_base_cased_url BertForSequenceClassification from Denyol +author: John Snow Labs +name: fakenews_bert_base_cased_url +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenews_bert_base_cased_url` is an English model originally trained by Denyol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_url_en_5.5.0_3.0_1727278145876.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_url_en_5.5.0_3.0_1727278145876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_base_cased_url","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_base_cased_url", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
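Each annotator in such a pipeline reads the columns named in `setInputCols`, so every input name has to match the output column of an earlier stage (or a column already present in the DataFrame). The following plain-Python sketch illustrates that wiring rule for a document → token → class pipeline; it does not call Spark NLP, and `find_unwired_inputs` is a hypothetical helper introduced only for illustration.

```python
# Plain-Python sketch, not part of Spark NLP: each stage is modelled as
# (stage name, input column names, output column name).
def find_unwired_inputs(stages, source_columns):
    """Return (stage, column) pairs whose input column no earlier stage produced."""
    available = set(source_columns)
    missing = []
    for name, inputs, output in stages:
        for column in inputs:
            if column not in available:
                missing.append((name, column))
        # A stage's output becomes available to every later stage.
        available.add(output)
    return missing


stages = [
    ("documentAssembler", ["text"], "document"),
    ("tokenizer", ["document"], "token"),
    ("sequenceClassifier", ["document", "token"], "class"),
]
print(find_unwired_inputs(stages, ["text"]))
```

With consistent names the list of problems is empty; a single misspelled input such as `documents` would be reported against the stage that requests it, which is typically easier to read than a runtime failure from the fitted pipeline.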
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenews_bert_base_cased_url| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Denyol/FakeNews-bert-base-cased-url \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-finbert_market_based_en.md b/docs/_posts/ahmedlone127/2024-09-25-finbert_market_based_en.md new file mode 100644 index 00000000000000..525d79642c05c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-finbert_market_based_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finbert_market_based BertForSequenceClassification from baptle +author: John Snow Labs +name: finbert_market_based +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_market_based` is an English model originally trained by baptle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_market_based_en_5.5.0_3.0_1727293518479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_market_based_en_5.5.0_3.0_1727293518479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finbert_market_based","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finbert_market_based", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert_market_based| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/baptle/FinBERT_market_based \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-finbert_tuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-finbert_tuned_pipeline_en.md new file mode 100644 index 00000000000000..1c7a5c75dea21f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-finbert_tuned_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finbert_tuned_pipeline pipeline BertForSequenceClassification from manvik28 +author: John Snow Labs +name: finbert_tuned_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_tuned_pipeline` is an English model originally trained by manvik28. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_tuned_pipeline_en_5.5.0_3.0_1727285220802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_tuned_pipeline_en_5.5.0_3.0_1727285220802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finbert_tuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finbert_tuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert_tuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/manvik28/FinBERT_Tuned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-fine_tuned_bert_large_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-25-fine_tuned_bert_large_uncased_en.md new file mode 100644 index 00000000000000..12f4e720f373e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-fine_tuned_bert_large_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fine_tuned_bert_large_uncased BertForSequenceClassification from Mawulom +author: John Snow Labs +name: fine_tuned_bert_large_uncased +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bert_large_uncased` is an English model originally trained by Mawulom. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_large_uncased_en_5.5.0_3.0_1727308775602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_large_uncased_en_5.5.0_3.0_1727308775602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_large_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_large_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bert_large_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Mawulom/Fine-Tuned-Bert-Large-Uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-fine_tuned_boolq_bert_croslo_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-fine_tuned_boolq_bert_croslo_pipeline_en.md new file mode 100644 index 00000000000000..b54bfda8f8d40b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-fine_tuned_boolq_bert_croslo_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fine_tuned_boolq_bert_croslo_pipeline pipeline BertForSequenceClassification from lenatr99 +author: John Snow Labs +name: fine_tuned_boolq_bert_croslo_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_boolq_bert_croslo_pipeline` is an English model originally trained by lenatr99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_boolq_bert_croslo_pipeline_en_5.5.0_3.0_1727276408688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_boolq_bert_croslo_pipeline_en_5.5.0_3.0_1727276408688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_boolq_bert_croslo_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_boolq_bert_croslo_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_boolq_bert_croslo_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|465.7 MB| + +## References + +https://huggingface.co/lenatr99/fine_tuned_boolq_bert_croslo + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-finetuned_bert_base_on_shemo_transcripts_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-finetuned_bert_base_on_shemo_transcripts_pipeline_en.md new file mode 100644 index 00000000000000..ed1359e5344026 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-finetuned_bert_base_on_shemo_transcripts_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuned_bert_base_on_shemo_transcripts_pipeline pipeline BertForSequenceClassification from minoosh +author: John Snow Labs +name: finetuned_bert_base_on_shemo_transcripts_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_on_shemo_transcripts_pipeline` is an English model originally trained by minoosh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_shemo_transcripts_pipeline_en_5.5.0_3.0_1727263667569.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_shemo_transcripts_pipeline_en_5.5.0_3.0_1727263667569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_bert_base_on_shemo_transcripts_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_bert_base_on_shemo_transcripts_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_on_shemo_transcripts_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/minoosh/finetuned_bert-base_on_shEMO_transcripts + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-finetuned_bert_base_uncased_olivernyu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-finetuned_bert_base_uncased_olivernyu_pipeline_en.md new file mode 100644 index 00000000000000..c5d5e04243dde8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-finetuned_bert_base_uncased_olivernyu_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuned_bert_base_uncased_olivernyu_pipeline pipeline BertForSequenceClassification from Olivernyu +author: John Snow Labs +name: finetuned_bert_base_uncased_olivernyu_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_uncased_olivernyu_pipeline` is an English model originally trained by Olivernyu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_uncased_olivernyu_pipeline_en_5.5.0_3.0_1727263643053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_uncased_olivernyu_pipeline_en_5.5.0_3.0_1727263643053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_bert_base_uncased_olivernyu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_bert_base_uncased_olivernyu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_uncased_olivernyu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Olivernyu/finetuned_bert_base_uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-finetuned_sentiment_classfication_bert_model_slickdata_en.md b/docs/_posts/ahmedlone127/2024-09-25-finetuned_sentiment_classfication_bert_model_slickdata_en.md new file mode 100644 index 00000000000000..0baf05b36f0f40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-finetuned_sentiment_classfication_bert_model_slickdata_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuned_sentiment_classfication_bert_model_slickdata BertForSequenceClassification from slickdata +author: John Snow Labs +name: finetuned_sentiment_classfication_bert_model_slickdata +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentiment_classfication_bert_model_slickdata` is an English model originally trained by slickdata. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_classfication_bert_model_slickdata_en_5.5.0_3.0_1727303432700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_classfication_bert_model_slickdata_en_5.5.0_3.0_1727303432700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_sentiment_classfication_bert_model_slickdata","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_sentiment_classfication_bert_model_slickdata", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentiment_classfication_bert_model_slickdata| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/slickdata/finetuned-Sentiment-classfication-BERT-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-finetuned_sentiment_classfication_bert_model_slickdata_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-finetuned_sentiment_classfication_bert_model_slickdata_pipeline_en.md new file mode 100644 index 00000000000000..40889e74da2166 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-finetuned_sentiment_classfication_bert_model_slickdata_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuned_sentiment_classfication_bert_model_slickdata_pipeline pipeline BertForSequenceClassification from slickdata +author: John Snow Labs +name: finetuned_sentiment_classfication_bert_model_slickdata_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentiment_classfication_bert_model_slickdata_pipeline` is an English model originally trained by slickdata. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_classfication_bert_model_slickdata_pipeline_en_5.5.0_3.0_1727303454408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_classfication_bert_model_slickdata_pipeline_en_5.5.0_3.0_1727303454408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_sentiment_classfication_bert_model_slickdata_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_sentiment_classfication_bert_model_slickdata_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentiment_classfication_bert_model_slickdata_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/slickdata/finetuned-Sentiment-classfication-BERT-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-finetuning_bert_base_uncased_on_cornell_sentiment_en.md b/docs/_posts/ahmedlone127/2024-09-25-finetuning_bert_base_uncased_on_cornell_sentiment_en.md new file mode 100644 index 00000000000000..1466a8aaa379c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-finetuning_bert_base_uncased_on_cornell_sentiment_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_bert_base_uncased_on_cornell_sentiment BertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_bert_base_uncased_on_cornell_sentiment +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_base_uncased_on_cornell_sentiment` is an English model originally trained by Ibrahim-Alam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_cornell_sentiment_en_5.5.0_3.0_1727290742807.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_cornell_sentiment_en_5.5.0_3.0_1727290742807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_cornell_sentiment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_cornell_sentiment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_bert_base_uncased_on_cornell_sentiment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-bert-base-uncased-on-Cornell_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-forc_s1_bert_en.md b/docs/_posts/ahmedlone127/2024-09-25-forc_s1_bert_en.md new file mode 100644 index 00000000000000..f4816b5572316f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-forc_s1_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English forc_s1_bert BertForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: forc_s1_bert +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `forc_s1_bert` is an English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/forc_s1_bert_en_5.5.0_3.0_1727307591066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/forc_s1_bert_en_5.5.0_3.0_1727307591066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("forc_s1_bert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("forc_s1_bert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|forc_s1_bert|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.7 MB|
+
+## References
+
+https://huggingface.co/ThuyNT03/FoRC_S1_BERT
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-frugalscore_tiny_deberta_bert_score_en.md b/docs/_posts/ahmedlone127/2024-09-25-frugalscore_tiny_deberta_bert_score_en.md
new file mode 100644
index 00000000000000..75dc86cfa670a1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-frugalscore_tiny_deberta_bert_score_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English frugalscore_tiny_deberta_bert_score BertForSequenceClassification from moussaKam
+author: John Snow Labs
+name: frugalscore_tiny_deberta_bert_score
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_tiny_deberta_bert_score` is an English model originally trained by moussaKam.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_deberta_bert_score_en_5.5.0_3.0_1727295767549.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_deberta_bert_score_en_5.5.0_3.0_1727295767549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_tiny_deberta_bert_score","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_tiny_deberta_bert_score", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|frugalscore_tiny_deberta_bert_score|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|16.7 MB|
+
+## References
+
+https://huggingface.co/moussaKam/frugalscore_tiny_deberta_bert-score
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-frugalscore_tiny_deberta_bert_score_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-frugalscore_tiny_deberta_bert_score_pipeline_en.md
new file mode 100644
index 00000000000000..e5360f55582036
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-frugalscore_tiny_deberta_bert_score_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English frugalscore_tiny_deberta_bert_score_pipeline pipeline BertForSequenceClassification from moussaKam
+author: John Snow Labs
+name: frugalscore_tiny_deberta_bert_score_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_tiny_deberta_bert_score_pipeline` is an English model originally trained by moussaKam.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_deberta_bert_score_pipeline_en_5.5.0_3.0_1727295768767.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_deberta_bert_score_pipeline_en_5.5.0_3.0_1727295768767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("frugalscore_tiny_deberta_bert_score_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("frugalscore_tiny_deberta_bert_score_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|frugalscore_tiny_deberta_bert_score_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|16.8 MB|
+
+## References
+
+https://huggingface.co/moussaKam/frugalscore_tiny_deberta_bert-score
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-gdpr_rag_en.md b/docs/_posts/ahmedlone127/2024-09-25-gdpr_rag_en.md
new file mode 100644
index 00000000000000..6181ddc1296595
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-gdpr_rag_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English gdpr_rag BertForSequenceClassification from rdhinaz
+author: John Snow Labs
+name: gdpr_rag
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gdpr_rag` is an English model originally trained by rdhinaz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gdpr_rag_en_5.5.0_3.0_1727297221164.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gdpr_rag_en_5.5.0_3.0_1727297221164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("gdpr_rag","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("gdpr_rag", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|gdpr_rag|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/rdhinaz/GDPR-RAG
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-gdpr_rag_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-gdpr_rag_pipeline_en.md
new file mode 100644
index 00000000000000..3a2c3e356a39e8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-gdpr_rag_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English gdpr_rag_pipeline pipeline BertForSequenceClassification from rdhinaz
+author: John Snow Labs
+name: gdpr_rag_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gdpr_rag_pipeline` is an English model originally trained by rdhinaz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gdpr_rag_pipeline_en_5.5.0_3.0_1727297243284.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gdpr_rag_pipeline_en_5.5.0_3.0_1727297243284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("gdpr_rag_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("gdpr_rag_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|gdpr_rag_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/rdhinaz/GDPR-RAG
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-genome_finder_en.md b/docs/_posts/ahmedlone127/2024-09-25-genome_finder_en.md
new file mode 100644
index 00000000000000..e2cb9218ec0bce
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-genome_finder_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English genome_finder BertForSequenceClassification from rdhinaz
+author: John Snow Labs
+name: genome_finder
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `genome_finder` is an English model originally trained by rdhinaz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/genome_finder_en_5.5.0_3.0_1727273143520.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/genome_finder_en_5.5.0_3.0_1727273143520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("genome_finder","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("genome_finder", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|genome_finder|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/rdhinaz/genome-finder
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-github_issues_classification_final_en.md b/docs/_posts/ahmedlone127/2024-09-25-github_issues_classification_final_en.md
new file mode 100644
index 00000000000000..3bad3528c9ecbd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-github_issues_classification_final_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English github_issues_classification_final BertForSequenceClassification from peler1nl1kelt0s
+author: John Snow Labs
+name: github_issues_classification_final
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `github_issues_classification_final` is an English model originally trained by peler1nl1kelt0s.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/github_issues_classification_final_en_5.5.0_3.0_1727295480657.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/github_issues_classification_final_en_5.5.0_3.0_1727295480657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("github_issues_classification_final","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("github_issues_classification_final", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|github_issues_classification_final|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/peler1nl1kelt0s/github-issues-classification-final
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-github_issues_classification_final_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-github_issues_classification_final_pipeline_en.md
new file mode 100644
index 00000000000000..6aaea74d826d55
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-github_issues_classification_final_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English github_issues_classification_final_pipeline pipeline BertForSequenceClassification from peler1nl1kelt0s
+author: John Snow Labs
+name: github_issues_classification_final_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `github_issues_classification_final_pipeline` is an English model originally trained by peler1nl1kelt0s.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/github_issues_classification_final_pipeline_en_5.5.0_3.0_1727295502645.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/github_issues_classification_final_pipeline_en_5.5.0_3.0_1727295502645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("github_issues_classification_final_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("github_issues_classification_final_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|github_issues_classification_final_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/peler1nl1kelt0s/github-issues-classification-final
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-hyp_only_gpt_4_filtered_final_en.md b/docs/_posts/ahmedlone127/2024-09-25-hyp_only_gpt_4_filtered_final_en.md
new file mode 100644
index 00000000000000..bd6e2c74173f5f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-hyp_only_gpt_4_filtered_final_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English hyp_only_gpt_4_filtered_final BertForSequenceClassification from grace-pro
+author: John Snow Labs
+name: hyp_only_gpt_4_filtered_final
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyp_only_gpt_4_filtered_final` is an English model originally trained by grace-pro.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyp_only_gpt_4_filtered_final_en_5.5.0_3.0_1727304581125.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyp_only_gpt_4_filtered_final_en_5.5.0_3.0_1727304581125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_gpt_4_filtered_final","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_gpt_4_filtered_final", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hyp_only_gpt_4_filtered_final|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/grace-pro/hyp_only_gpt_4_filtered_final
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-hyp_only_machine_gen_temp_1_en.md b/docs/_posts/ahmedlone127/2024-09-25-hyp_only_machine_gen_temp_1_en.md
new file mode 100644
index 00000000000000..c54fd9b76a452a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-hyp_only_machine_gen_temp_1_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English hyp_only_machine_gen_temp_1 BertForSequenceClassification from grace-pro
+author: John Snow Labs
+name: hyp_only_machine_gen_temp_1
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyp_only_machine_gen_temp_1` is an English model originally trained by grace-pro.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyp_only_machine_gen_temp_1_en_5.5.0_3.0_1727279350014.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyp_only_machine_gen_temp_1_en_5.5.0_3.0_1727279350014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_machine_gen_temp_1","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_machine_gen_temp_1", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hyp_only_machine_gen_temp_1|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/grace-pro/hyp_only_machine_gen_temp_1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-hyp_only_machine_gen_temp_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-hyp_only_machine_gen_temp_1_pipeline_en.md
new file mode 100644
index 00000000000000..ce74e0fc468d99
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-hyp_only_machine_gen_temp_1_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English hyp_only_machine_gen_temp_1_pipeline pipeline BertForSequenceClassification from grace-pro
+author: John Snow Labs
+name: hyp_only_machine_gen_temp_1_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyp_only_machine_gen_temp_1_pipeline` is an English model originally trained by grace-pro.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyp_only_machine_gen_temp_1_pipeline_en_5.5.0_3.0_1727279371405.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyp_only_machine_gen_temp_1_pipeline_en_5.5.0_3.0_1727279371405.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("hyp_only_machine_gen_temp_1_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("hyp_only_machine_gen_temp_1_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: an existing Spark DataFrame, assumed to have a "text" column
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hyp_only_machine_gen_temp_1_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/grace-pro/hyp_only_machine_gen_temp_1
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat_en.md b/docs/_posts/ahmedlone127/2024-09-25-imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat_en.md
new file mode 100644
index 00000000000000..76602bdfa5a5d5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat BertForSequenceClassification from edmejiat
+author: John Snow Labs
+name: imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat` is an English model originally trained by edmejiat.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat_en_5.5.0_3.0_1727297951778.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat_en_5.5.0_3.0_1727297951778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|imdbreviews_classification_distilbert_v02_clf_finetuning_edmejiat|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|626.8 MB|
+
+## References
+
+https://huggingface.co/edmejiat/imdbreviews_classification_distilbert_v02_clf_finetuning
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-indic_bert_finetuned_legal_try_with_muril_more_ft_en.md b/docs/_posts/ahmedlone127/2024-09-25-indic_bert_finetuned_legal_try_with_muril_more_ft_en.md
new file mode 100644
index 00000000000000..aa1260a9a9a837
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-indic_bert_finetuned_legal_try_with_muril_more_ft_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English indic_bert_finetuned_legal_try_with_muril_more_ft BertForSequenceClassification from PoptropicaSahil
+author: John Snow Labs
+name: indic_bert_finetuned_legal_try_with_muril_more_ft
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+    type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indic_bert_finetuned_legal_try_with_muril_more_ft` is an English model originally trained by PoptropicaSahil.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indic_bert_finetuned_legal_try_with_muril_more_ft_en_5.5.0_3.0_1727297903115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indic_bert_finetuned_legal_try_with_muril_more_ft_en_5.5.0_3.0_1727297903115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("indic_bert_finetuned_legal_try_with_muril_more_ft","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indic_bert_finetuned_legal_try_with_muril_more_ft", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indic_bert_finetuned_legal_try_with_muril_more_ft| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|128.2 MB| + +## References + +https://huggingface.co/PoptropicaSahil/indic-bert-finetuned-legal_try_with_muril_more_ft \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-init_bert_ft_qqp_79_en.md b/docs/_posts/ahmedlone127/2024-09-25-init_bert_ft_qqp_79_en.md new file mode 100644 index 00000000000000..dfcaa4cc187a1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-init_bert_ft_qqp_79_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English init_bert_ft_qqp_79 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: init_bert_ft_qqp_79 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `init_bert_ft_qqp_79` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/init_bert_ft_qqp_79_en_5.5.0_3.0_1727286514472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/init_bert_ft_qqp_79_en_5.5.0_3.0_1727286514472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("init_bert_ft_qqp_79","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("init_bert_ft_qqp_79", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|init_bert_ft_qqp_79| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/init_bert_ft_qqp-79 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-init_bert_ft_qqp_79_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-init_bert_ft_qqp_79_pipeline_en.md new file mode 100644 index 00000000000000..d879bc4ebe454a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-init_bert_ft_qqp_79_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English init_bert_ft_qqp_79_pipeline pipeline BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: init_bert_ft_qqp_79_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `init_bert_ft_qqp_79_pipeline` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/init_bert_ft_qqp_79_pipeline_en_5.5.0_3.0_1727286536176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/init_bert_ft_qqp_79_pipeline_en_5.5.0_3.0_1727286536176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("init_bert_ft_qqp_79_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("init_bert_ft_qqp_79_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|init_bert_ft_qqp_79_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/init_bert_ft_qqp-79 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-is712_en.md b/docs/_posts/ahmedlone127/2024-09-25-is712_en.md new file mode 100644 index 00000000000000..7f1f5f773d083b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-is712_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English is712 BertForSequenceClassification from Messerschmitt +author: John Snow Labs +name: is712 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `is712` is an English model originally trained by Messerschmitt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/is712_en_5.5.0_3.0_1727267466592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/is712_en_5.5.0_3.0_1727267466592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("is712","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("is712", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|is712| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.3 MB| + +## References + +https://huggingface.co/Messerschmitt/is712 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-kemenkeu_sentiment_classifier_id.md b/docs/_posts/ahmedlone127/2024-09-25-kemenkeu_sentiment_classifier_id.md new file mode 100644 index 00000000000000..7b35bbb2d19c55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-kemenkeu_sentiment_classifier_id.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Indonesian kemenkeu_sentiment_classifier BertForSequenceClassification from hanifnoerr +author: John Snow Labs +name: kemenkeu_sentiment_classifier +date: 2024-09-25 +tags: [id, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: id +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kemenkeu_sentiment_classifier` is an Indonesian model originally trained by hanifnoerr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kemenkeu_sentiment_classifier_id_5.5.0_3.0_1727305127973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kemenkeu_sentiment_classifier_id_5.5.0_3.0_1727305127973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("kemenkeu_sentiment_classifier","id") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("kemenkeu_sentiment_classifier", "id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kemenkeu_sentiment_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|466.4 MB| + +## References + +https://huggingface.co/hanifnoerr/Kemenkeu-Sentiment-Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-khadija_ner_en.md b/docs/_posts/ahmedlone127/2024-09-25-khadija_ner_en.md new file mode 100644 index 00000000000000..aaebf4dad2760a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-khadija_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English khadija_ner BertForTokenClassification from didazz +author: John Snow Labs +name: khadija_ner +date: 2024-09-25 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `khadija_ner` is an English model originally trained by didazz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/khadija_ner_en_5.5.0_3.0_1727283149808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/khadija_ner_en_5.5.0_3.0_1727283149808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("khadija_ner","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("khadija_ner", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|khadija_ner| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/didazz/khadija_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-knots_distillprotbert_alphafold_en.md b/docs/_posts/ahmedlone127/2024-09-25-knots_distillprotbert_alphafold_en.md new file mode 100644 index 00000000000000..760ff504863ea9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-knots_distillprotbert_alphafold_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English knots_distillprotbert_alphafold BertForSequenceClassification from EvaKlimentova +author: John Snow Labs +name: knots_distillprotbert_alphafold +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `knots_distillprotbert_alphafold` is an English model originally trained by EvaKlimentova. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/knots_distillprotbert_alphafold_en_5.5.0_3.0_1727306987195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/knots_distillprotbert_alphafold_en_5.5.0_3.0_1727306987195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("knots_distillprotbert_alphafold","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("knots_distillprotbert_alphafold", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|knots_distillprotbert_alphafold| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|864.5 MB| + +## References + +https://huggingface.co/EvaKlimentova/knots_distillprotbert_alphafold \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-knots_distillprotbert_alphafold_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-knots_distillprotbert_alphafold_pipeline_en.md new file mode 100644 index 00000000000000..2b488f20f75729 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-knots_distillprotbert_alphafold_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English knots_distillprotbert_alphafold_pipeline pipeline BertForSequenceClassification from EvaKlimentova +author: John Snow Labs +name: knots_distillprotbert_alphafold_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `knots_distillprotbert_alphafold_pipeline` is an English model originally trained by EvaKlimentova. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/knots_distillprotbert_alphafold_pipeline_en_5.5.0_3.0_1727307032305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/knots_distillprotbert_alphafold_pipeline_en_5.5.0_3.0_1727307032305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("knots_distillprotbert_alphafold_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("knots_distillprotbert_alphafold_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|knots_distillprotbert_alphafold_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|864.6 MB| + +## References + +https://huggingface.co/EvaKlimentova/knots_distillprotbert_alphafold + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-kor_naver_ner_name_en.md b/docs/_posts/ahmedlone127/2024-09-25-kor_naver_ner_name_en.md new file mode 100644 index 00000000000000..1c6582134037b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-kor_naver_ner_name_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English kor_naver_ner_name BertForTokenClassification from joon09 +author: John Snow Labs +name: kor_naver_ner_name +date: 2024-09-25 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kor_naver_ner_name` is an English model originally trained by joon09. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kor_naver_ner_name_en_5.5.0_3.0_1727262951717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kor_naver_ner_name_en_5.5.0_3.0_1727262951717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("kor_naver_ner_name","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("kor_naver_ner_name", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kor_naver_ner_name| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|441.2 MB| + +## References + +https://huggingface.co/joon09/kor-naver-ner-name \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-korean_disease_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-korean_disease_ner_pipeline_en.md new file mode 100644 index 00000000000000..4d8dcdb73a9f42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-korean_disease_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English korean_disease_ner_pipeline pipeline BertForTokenClassification from keonju +author: John Snow Labs +name: korean_disease_ner_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `korean_disease_ner_pipeline` is an English model originally trained by keonju. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/korean_disease_ner_pipeline_en_5.5.0_3.0_1727283045139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/korean_disease_ner_pipeline_en_5.5.0_3.0_1727283045139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("korean_disease_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("korean_disease_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|korean_disease_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|412.5 MB| + +## References + +https://huggingface.co/keonju/korean_disease_ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-l_12_h_512_a_8_sst2_en.md b/docs/_posts/ahmedlone127/2024-09-25-l_12_h_512_a_8_sst2_en.md new file mode 100644 index 00000000000000..41c7c813ab9410 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-l_12_h_512_a_8_sst2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English l_12_h_512_a_8_sst2 BertForSequenceClassification from Sayan01 +author: John Snow Labs +name: l_12_h_512_a_8_sst2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `l_12_h_512_a_8_sst2` is an English model originally trained by Sayan01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/l_12_h_512_a_8_sst2_en_5.5.0_3.0_1727267394261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/l_12_h_512_a_8_sst2_en_5.5.0_3.0_1727267394261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("l_12_h_512_a_8_sst2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("l_12_h_512_a_8_sst2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|l_12_h_512_a_8_sst2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|202.4 MB| + +## References + +https://huggingface.co/Sayan01/L-12_H-512_A-8_sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-learn2therm_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-learn2therm_pipeline_en.md new file mode 100644 index 00000000000000..1142a13e315e68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-learn2therm_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English learn2therm_pipeline pipeline BertForSequenceClassification from evankomp +author: John Snow Labs +name: learn2therm_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `learn2therm_pipeline` is an English model originally trained by evankomp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/learn2therm_pipeline_en_5.5.0_3.0_1727304871925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/learn2therm_pipeline_en_5.5.0_3.0_1727304871925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("learn2therm_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("learn2therm_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|learn2therm_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/evankomp/learn2therm + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline_en.md new file mode 100644 index 00000000000000..fbd2fafca28a46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline pipeline BertForSequenceClassification from wiorz +author: John Snow Labs +name: legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline` is an English model originally trained by wiorz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline_en_5.5.0_3.0_1727288653152.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline_en_5.5.0_3.0_1727288653152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_bert_samoan_gen1_large_summarized_chuvash_4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/wiorz/legal_bert_sm_gen1_large_summarized_cv_4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-legal_ner_finetuned_en.md b/docs/_posts/ahmedlone127/2024-09-25-legal_ner_finetuned_en.md new file mode 100644 index 00000000000000..6536a94d0fac53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-legal_ner_finetuned_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English legal_ner_finetuned BertForTokenClassification from AjayMukundS +author: John Snow Labs +name: legal_ner_finetuned +date: 2024-09-25 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_ner_finetuned` is an English model originally trained by AjayMukundS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_ner_finetuned_en_5.5.0_3.0_1727271184973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_ner_finetuned_en_5.5.0_3.0_1727271184973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("legal_ner_finetuned","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("legal_ner_finetuned", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_ner_finetuned| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/AjayMukundS/Legal-NER-Finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-lngdistilledmodel_en.md b/docs/_posts/ahmedlone127/2024-09-25-lngdistilledmodel_en.md new file mode 100644 index 00000000000000..082450343ed3b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-lngdistilledmodel_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English lngdistilledmodel BertForSequenceClassification from privacy-tech-lab +author: John Snow Labs +name: lngdistilledmodel +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lngdistilledmodel` is an English model originally trained by privacy-tech-lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lngdistilledmodel_en_5.5.0_3.0_1727272892393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lngdistilledmodel_en_5.5.0_3.0_1727272892393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("lngdistilledmodel","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("lngdistilledmodel", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lngdistilledmodel| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/privacy-tech-lab/LngDistilledModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-mbert_fc_pipeline_vi.md b/docs/_posts/ahmedlone127/2024-09-25-mbert_fc_pipeline_vi.md new file mode 100644 index 00000000000000..a63c610217195e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-mbert_fc_pipeline_vi.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Vietnamese mbert_fc_pipeline pipeline BertForSequenceClassification from SonFox2920 +author: John Snow Labs +name: mbert_fc_pipeline +date: 2024-09-25 +tags: [vi, open_source, pipeline, onnx] +task: Text Classification +language: vi +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mbert_fc_pipeline` is a Vietnamese model originally trained by SonFox2920. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_fc_pipeline_vi_5.5.0_3.0_1727290023120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_fc_pipeline_vi_5.5.0_3.0_1727290023120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mbert_fc_pipeline", lang = "vi") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mbert_fc_pipeline", lang = "vi") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_fc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|vi| +|Size:|667.3 MB| + +## References + +https://huggingface.co/SonFox2920/MBert_FC + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-mbert_fc_vi.md b/docs/_posts/ahmedlone127/2024-09-25-mbert_fc_vi.md new file mode 100644 index 00000000000000..bdb32b133574ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-mbert_fc_vi.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Vietnamese mbert_fc BertForSequenceClassification from SonFox2920 +author: John Snow Labs +name: mbert_fc +date: 2024-09-25 +tags: [vi, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: vi +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mbert_fc` is a Vietnamese model originally trained by SonFox2920. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_fc_vi_5.5.0_3.0_1727289989246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_fc_vi_5.5.0_3.0_1727289989246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mbert_fc","vi") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mbert_fc", "vi") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_fc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|vi| +|Size:|667.3 MB| + +## References + +https://huggingface.co/SonFox2920/MBert_FC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-memo_bert_wsd_01_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-memo_bert_wsd_01_pipeline_en.md new file mode 100644 index 00000000000000..c063a9a28f0d7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-memo_bert_wsd_01_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English memo_bert_wsd_01_pipeline pipeline BertForSequenceClassification from yemen2016 +author: John Snow Labs +name: memo_bert_wsd_01_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `memo_bert_wsd_01_pipeline` is an English model originally trained by yemen2016. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/memo_bert_wsd_01_pipeline_en_5.5.0_3.0_1727285953316.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/memo_bert_wsd_01_pipeline_en_5.5.0_3.0_1727285953316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("memo_bert_wsd_01_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("memo_bert_wsd_01_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|memo_bert_wsd_01_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/yemen2016/MeMo_BERT-WSD-01 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-miniproject_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-miniproject_pipeline_en.md new file mode 100644 index 00000000000000..0e4f8ae6d5589a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-miniproject_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English miniproject_pipeline pipeline BertForSequenceClassification from samirangupta31 +author: John Snow Labs +name: miniproject_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `miniproject_pipeline` is an English model originally trained by samirangupta31. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/miniproject_pipeline_en_5.5.0_3.0_1727289367899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/miniproject_pipeline_en_5.5.0_3.0_1727289367899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("miniproject_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("miniproject_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|miniproject_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/samirangupta31/MiniProject + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-misinformation_covid_bert_base_german_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-misinformation_covid_bert_base_german_cased_pipeline_en.md new file mode 100644 index 00000000000000..d625d587d3023a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-misinformation_covid_bert_base_german_cased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English misinformation_covid_bert_base_german_cased_pipeline pipeline BertForSequenceClassification from Ghunghru +author: John Snow Labs +name: misinformation_covid_bert_base_german_cased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `misinformation_covid_bert_base_german_cased_pipeline` is an English model originally trained by Ghunghru. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/misinformation_covid_bert_base_german_cased_pipeline_en_5.5.0_3.0_1727287963886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/misinformation_covid_bert_base_german_cased_pipeline_en_5.5.0_3.0_1727287963886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("misinformation_covid_bert_base_german_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("misinformation_covid_bert_base_german_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|misinformation_covid_bert_base_german_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/Ghunghru/Misinformation-Covid-bert-base-german-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-mobilebert_stsb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-mobilebert_stsb_pipeline_en.md new file mode 100644 index 00000000000000..03506ea6493716 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-mobilebert_stsb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mobilebert_stsb_pipeline pipeline BertForSequenceClassification from Alireza1044 +author: John Snow Labs +name: mobilebert_stsb_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mobilebert_stsb_pipeline` is an English model originally trained by Alireza1044. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mobilebert_stsb_pipeline_en_5.5.0_3.0_1727287383623.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mobilebert_stsb_pipeline_en_5.5.0_3.0_1727287383623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mobilebert_stsb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mobilebert_stsb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mobilebert_stsb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|92.6 MB| + +## References + +https://huggingface.co/Alireza1044/mobilebert_stsb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-modela_1_12_2023_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-modela_1_12_2023_pipeline_en.md new file mode 100644 index 00000000000000..e4bd049f054502 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-modela_1_12_2023_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English modela_1_12_2023_pipeline pipeline BertForTokenClassification from MaryDatascientist +author: John Snow Labs +name: modela_1_12_2023_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modela_1_12_2023_pipeline` is an English model originally trained by MaryDatascientist. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modela_1_12_2023_pipeline_en_5.5.0_3.0_1727264626556.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modela_1_12_2023_pipeline_en_5.5.0_3.0_1727264626556.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("modela_1_12_2023_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("modela_1_12_2023_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modela_1_12_2023_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/MaryDatascientist/modelA_1_12_2023 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-modlee_transformer_en.md b/docs/_posts/ahmedlone127/2024-09-25-modlee_transformer_en.md new file mode 100644 index 00000000000000..7675995d0e22c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-modlee_transformer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English modlee_transformer BertForSequenceClassification from harshitakukreja +author: John Snow Labs +name: modlee_transformer +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modlee_transformer` is an English model originally trained by harshitakukreja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modlee_transformer_en_5.5.0_3.0_1727273270830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modlee_transformer_en_5.5.0_3.0_1727273270830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("modlee_transformer","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("modlee_transformer", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modlee_transformer| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/harshitakukreja/modlee_transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-monglish_arabic_faq_v2_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-09-25-monglish_arabic_faq_v2_pipeline_ar.md new file mode 100644 index 00000000000000..bd3dfe730bf025 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-monglish_arabic_faq_v2_pipeline_ar.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Arabic monglish_arabic_faq_v2_pipeline pipeline BertForSequenceClassification from Ahmedhany216 +author: John Snow Labs +name: monglish_arabic_faq_v2_pipeline +date: 2024-09-25 +tags: [ar, open_source, pipeline, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `monglish_arabic_faq_v2_pipeline` is an Arabic model originally trained by Ahmedhany216. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/monglish_arabic_faq_v2_pipeline_ar_5.5.0_3.0_1727292098628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/monglish_arabic_faq_v2_pipeline_ar_5.5.0_3.0_1727292098628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("monglish_arabic_faq_v2_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("monglish_arabic_faq_v2_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|monglish_arabic_faq_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|408.7 MB| + +## References + +https://huggingface.co/Ahmedhany216/Monglish_Arabic_FAQ-V2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-moviebertreview_sentimentprediction_model_imalexianne_en.md b/docs/_posts/ahmedlone127/2024-09-25-moviebertreview_sentimentprediction_model_imalexianne_en.md new file mode 100644 index 00000000000000..a09f5ead57d759 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-moviebertreview_sentimentprediction_model_imalexianne_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English moviebertreview_sentimentprediction_model_imalexianne BertForSequenceClassification from imalexianne +author: John Snow Labs +name: moviebertreview_sentimentprediction_model_imalexianne +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moviebertreview_sentimentprediction_model_imalexianne` is an English model originally trained by imalexianne. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moviebertreview_sentimentprediction_model_imalexianne_en_5.5.0_3.0_1727288190898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moviebertreview_sentimentprediction_model_imalexianne_en_5.5.0_3.0_1727288190898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("moviebertreview_sentimentprediction_model_imalexianne","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("moviebertreview_sentimentprediction_model_imalexianne", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moviebertreview_sentimentprediction_model_imalexianne| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/imalexianne/MovieBertReview-SentimentPrediction-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-moviebertreview_sentimentprediction_model_imalexianne_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-moviebertreview_sentimentprediction_model_imalexianne_pipeline_en.md new file mode 100644 index 00000000000000..6fa42da5fe8303 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-moviebertreview_sentimentprediction_model_imalexianne_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English moviebertreview_sentimentprediction_model_imalexianne_pipeline pipeline BertForSequenceClassification from imalexianne +author: John Snow Labs +name: moviebertreview_sentimentprediction_model_imalexianne_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moviebertreview_sentimentprediction_model_imalexianne_pipeline` is an English model originally trained by imalexianne.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moviebertreview_sentimentprediction_model_imalexianne_pipeline_en_5.5.0_3.0_1727288217032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moviebertreview_sentimentprediction_model_imalexianne_pipeline_en_5.5.0_3.0_1727288217032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("moviebertreview_sentimentprediction_model_imalexianne_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("moviebertreview_sentimentprediction_model_imalexianne_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moviebertreview_sentimentprediction_model_imalexianne_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/imalexianne/MovieBertReview-SentimentPrediction-Model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-mrpc_output_en.md b/docs/_posts/ahmedlone127/2024-09-25-mrpc_output_en.md new file mode 100644 index 00000000000000..3c2237d3336bc6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-mrpc_output_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mrpc_output BertForSequenceClassification from shivangi +author: John Snow Labs +name: mrpc_output +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mrpc_output` is an English model originally trained by shivangi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mrpc_output_en_5.5.0_3.0_1727292031298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mrpc_output_en_5.5.0_3.0_1727292031298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mrpc_output","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mrpc_output", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mrpc_output| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/shivangi/MRPC_output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-mrpc_output_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-mrpc_output_pipeline_en.md new file mode 100644 index 00000000000000..ccdd0eb5587f40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-mrpc_output_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mrpc_output_pipeline pipeline BertForSequenceClassification from shivangi +author: John Snow Labs +name: mrpc_output_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mrpc_output_pipeline` is an English model originally trained by shivangi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mrpc_output_pipeline_en_5.5.0_3.0_1727292052387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mrpc_output_pipeline_en_5.5.0_3.0_1727292052387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mrpc_output_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mrpc_output_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mrpc_output_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/shivangi/MRPC_output + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-n_bert_imdb_padding100model_en.md b/docs/_posts/ahmedlone127/2024-09-25-n_bert_imdb_padding100model_en.md new file mode 100644 index 00000000000000..d141ee1420147d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-n_bert_imdb_padding100model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_imdb_padding100model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_imdb_padding100model +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_imdb_padding100model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding100model_en_5.5.0_3.0_1727301351880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding100model_en_5.5.0_3.0_1727301351880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding100model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding100model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_imdb_padding100model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.7 MB| + +## References + +https://huggingface.co/Realgon/N_bert_imdb_padding100model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-n_bert_twitterfin_padding20model_en.md b/docs/_posts/ahmedlone127/2024-09-25-n_bert_twitterfin_padding20model_en.md new file mode 100644 index 00000000000000..91e93faada8229 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-n_bert_twitterfin_padding20model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_twitterfin_padding20model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_twitterfin_padding20model +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_twitterfin_padding20model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_twitterfin_padding20model_en_5.5.0_3.0_1727290264758.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_twitterfin_padding20model_en_5.5.0_3.0_1727290264758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_twitterfin_padding20model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_twitterfin_padding20model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_twitterfin_padding20model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_twitterfin_padding20model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-name_anonymization_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-09-25-name_anonymization_pipeline_tr.md new file mode 100644 index 00000000000000..346cf8ef2b3ab1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-name_anonymization_pipeline_tr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Turkish name_anonymization_pipeline pipeline BertForTokenClassification from deprem-ml +author: John Snow Labs +name: name_anonymization_pipeline +date: 2024-09-25 +tags: [tr, open_source, pipeline, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `name_anonymization_pipeline` is a Turkish model originally trained by deprem-ml. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/name_anonymization_pipeline_tr_5.5.0_3.0_1727284040918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/name_anonymization_pipeline_tr_5.5.0_3.0_1727284040918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("name_anonymization_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("name_anonymization_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|name_anonymization_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|412.3 MB| + +## References + +https://huggingface.co/deprem-ml/name_anonymization + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-nameattrsbertfinal_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-nameattrsbertfinal_pipeline_en.md new file mode 100644 index 00000000000000..fe00b16957f738 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-nameattrsbertfinal_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English nameattrsbertfinal_pipeline pipeline BertForSequenceClassification from madgnome +author: John Snow Labs +name: nameattrsbertfinal_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nameattrsbertfinal_pipeline` is an English model originally trained by madgnome. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nameattrsbertfinal_pipeline_en_5.5.0_3.0_1727293834116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nameattrsbertfinal_pipeline_en_5.5.0_3.0_1727293834116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("nameattrsbertfinal_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("nameattrsbertfinal_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nameattrsbertfinal_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|669.0 MB| + +## References + +https://huggingface.co/madgnome/nameattrsbertfinal + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-nusabert_base_indonesian_plutchik_emotion_analysis_id.md b/docs/_posts/ahmedlone127/2024-09-25-nusabert_base_indonesian_plutchik_emotion_analysis_id.md new file mode 100644 index 00000000000000..6812a81b7be6ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-nusabert_base_indonesian_plutchik_emotion_analysis_id.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Indonesian nusabert_base_indonesian_plutchik_emotion_analysis BertForSequenceClassification from Aardiiiiy +author: John Snow Labs +name: nusabert_base_indonesian_plutchik_emotion_analysis +date: 2024-09-25 +tags: [id, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: id +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nusabert_base_indonesian_plutchik_emotion_analysis` is an Indonesian model originally trained by Aardiiiiy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nusabert_base_indonesian_plutchik_emotion_analysis_id_5.5.0_3.0_1727238055468.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nusabert_base_indonesian_plutchik_emotion_analysis_id_5.5.0_3.0_1727238055468.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("nusabert_base_indonesian_plutchik_emotion_analysis","id") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nusabert_base_indonesian_plutchik_emotion_analysis", "id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nusabert_base_indonesian_plutchik_emotion_analysis| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|414.6 MB| + +## References + +https://huggingface.co/Aardiiiiy/NusaBERT-base-Indonesian-Plutchik-emotion-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-only_title_desc_en.md b/docs/_posts/ahmedlone127/2024-09-25-only_title_desc_en.md new file mode 100644 index 00000000000000..b3bda64e107b30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-only_title_desc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English only_title_desc BertForSequenceClassification from SajjadAyoubi +author: John Snow Labs +name: only_title_desc +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `only_title_desc` is an English model originally trained by SajjadAyoubi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/only_title_desc_en_5.5.0_3.0_1727299754353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/only_title_desc_en_5.5.0_3.0_1727299754353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("only_title_desc","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("only_title_desc", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|only_title_desc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.8 MB| + +## References + +https://huggingface.co/SajjadAyoubi/only-title-desc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-opus_em_augmented_en.md b/docs/_posts/ahmedlone127/2024-09-25-opus_em_augmented_en.md new file mode 100644 index 00000000000000..da93355ce6faf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-opus_em_augmented_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English opus_em_augmented BertForSequenceClassification from keremp +author: John Snow Labs +name: opus_em_augmented +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `opus_em_augmented` is an English model originally trained by keremp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opus_em_augmented_en_5.5.0_3.0_1727267653690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opus_em_augmented_en_5.5.0_3.0_1727267653690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("opus_em_augmented","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("opus_em_augmented", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|opus_em_augmented| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/keremp/opus-em-augmented \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-opus_em_bge_base_en.md b/docs/_posts/ahmedlone127/2024-09-25-opus_em_bge_base_en.md new file mode 100644 index 00000000000000..d6d2ff951e3c49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-opus_em_bge_base_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English opus_em_bge_base BertForSequenceClassification from keremp +author: John Snow Labs +name: opus_em_bge_base +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `opus_em_bge_base` is an English model originally trained by keremp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opus_em_bge_base_en_5.5.0_3.0_1727305656269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opus_em_bge_base_en_5.5.0_3.0_1727305656269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("opus_em_bge_base","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("opus_em_bge_base", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|opus_em_bge_base| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|381.3 MB| + +## References + +https://huggingface.co/keremp/opus-em-bge-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-opus_em_bge_base_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-opus_em_bge_base_pipeline_en.md new file mode 100644 index 00000000000000..631399498c423b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-opus_em_bge_base_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English opus_em_bge_base_pipeline pipeline BertForSequenceClassification from keremp +author: John Snow Labs +name: opus_em_bge_base_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `opus_em_bge_base_pipeline` is an English model originally trained by keremp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opus_em_bge_base_pipeline_en_5.5.0_3.0_1727305687287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opus_em_bge_base_pipeline_en_5.5.0_3.0_1727305687287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("opus_em_bge_base_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("opus_em_bge_base_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|opus_em_bge_base_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|381.4 MB| + +## References + +https://huggingface.co/keremp/opus-em-bge-base + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-pabee_bert_base_sst2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-pabee_bert_base_sst2_pipeline_en.md new file mode 100644 index 00000000000000..87a27d49974f6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-pabee_bert_base_sst2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English pabee_bert_base_sst2_pipeline pipeline BertForSequenceClassification from mattymchen +author: John Snow Labs +name: pabee_bert_base_sst2_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pabee_bert_base_sst2_pipeline` is an English model originally trained by mattymchen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pabee_bert_base_sst2_pipeline_en_5.5.0_3.0_1727276229689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pabee_bert_base_sst2_pipeline_en_5.5.0_3.0_1727276229689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("pabee_bert_base_sst2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("pabee_bert_base_sst2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pabee_bert_base_sst2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mattymchen/pabee-bert-base-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-permissions_bert_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-25-permissions_bert_uncased_en.md new file mode 100644 index 00000000000000..9af23ce13fc50c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-permissions_bert_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English permissions_bert_uncased BertForSequenceClassification from etham13 +author: John Snow Labs +name: permissions_bert_uncased +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `permissions_bert_uncased` is an English model originally trained by etham13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/permissions_bert_uncased_en_5.5.0_3.0_1727295987748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/permissions_bert_uncased_en_5.5.0_3.0_1727295987748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("permissions_bert_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("permissions_bert_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|permissions_bert_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/etham13/permissions_bert_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-permissions_bert_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-permissions_bert_uncased_pipeline_en.md new file mode 100644 index 00000000000000..bed00c99c5718e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-permissions_bert_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English permissions_bert_uncased_pipeline pipeline BertForSequenceClassification from etham13 +author: John Snow Labs +name: permissions_bert_uncased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `permissions_bert_uncased_pipeline` is an English model originally trained by etham13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/permissions_bert_uncased_pipeline_en_5.5.0_3.0_1727296013208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/permissions_bert_uncased_pipeline_en_5.5.0_3.0_1727296013208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("permissions_bert_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("permissions_bert_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|permissions_bert_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/etham13/permissions_bert_uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phishing_detection_en.md b/docs/_posts/ahmedlone127/2024-09-25-phishing_detection_en.md new file mode 100644 index 00000000000000..5ca6668167b651 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phishing_detection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phishing_detection BertForSequenceClassification from Wachu2005 +author: John Snow Labs +name: phishing_detection +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phishing_detection` is an English model originally trained by Wachu2005. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phishing_detection_en_5.5.0_3.0_1727273097216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phishing_detection_en_5.5.0_3.0_1727273097216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phishing_detection","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phishing_detection", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phishing_detection| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Wachu2005/Phishing_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phishing_detection_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-phishing_detection_pipeline_en.md new file mode 100644 index 00000000000000..bd1e400d4b90d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phishing_detection_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phishing_detection_pipeline pipeline BertForSequenceClassification from Wachu2005 +author: John Snow Labs +name: phishing_detection_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phishing_detection_pipeline` is an English model originally trained by Wachu2005. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phishing_detection_pipeline_en_5.5.0_3.0_1727273169605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phishing_detection_pipeline_en_5.5.0_3.0_1727273169605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phishing_detection_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phishing_detection_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phishing_detection_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Wachu2005/Phishing_detection + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phishing_email_detection1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-phishing_email_detection1_pipeline_en.md new file mode 100644 index 00000000000000..e26b3d4604580c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phishing_email_detection1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phishing_email_detection1_pipeline pipeline BertForSequenceClassification from kithangw +author: John Snow Labs +name: phishing_email_detection1_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phishing_email_detection1_pipeline` is an English model originally trained by kithangw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phishing_email_detection1_pipeline_en_5.5.0_3.0_1727269660001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phishing_email_detection1_pipeline_en_5.5.0_3.0_1727269660001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phishing_email_detection1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phishing_email_detection1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phishing_email_detection1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/kithangw/phishing_email_detection1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_isrraelmendoza92_en.md b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_isrraelmendoza92_en.md new file mode 100644 index 00000000000000..c431c29977b13a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_isrraelmendoza92_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_isrraelmendoza92 BertForSequenceClassification from isrraelmendoza92 +author: John Snow Labs +name: phrasebank_sentiment_analysis_isrraelmendoza92 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_isrraelmendoza92` is an English model originally trained by isrraelmendoza92.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_isrraelmendoza92_en_5.5.0_3.0_1727306179028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_isrraelmendoza92_en_5.5.0_3.0_1727306179028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_isrraelmendoza92","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_isrraelmendoza92", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_isrraelmendoza92| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/isrraelmendoza92/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_jpbianchi_en.md b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_jpbianchi_en.md new file mode 100644 index 00000000000000..4f33a01cd77490 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_jpbianchi_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_jpbianchi BertForSequenceClassification from JPBianchi +author: John Snow Labs +name: phrasebank_sentiment_analysis_jpbianchi +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_jpbianchi` is an English model originally trained by JPBianchi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_jpbianchi_en_5.5.0_3.0_1727285679851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_jpbianchi_en_5.5.0_3.0_1727285679851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_jpbianchi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_jpbianchi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_jpbianchi| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JPBianchi/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_nikolasmoya_en.md b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_nikolasmoya_en.md new file mode 100644 index 00000000000000..0a265ff5979584 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_nikolasmoya_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_nikolasmoya BertForSequenceClassification from nikolasmoya +author: John Snow Labs +name: phrasebank_sentiment_analysis_nikolasmoya +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_nikolasmoya` is an English model originally trained by nikolasmoya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_nikolasmoya_en_5.5.0_3.0_1727301829571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_nikolasmoya_en_5.5.0_3.0_1727301829571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_nikolasmoya","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_nikolasmoya", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_nikolasmoya| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nikolasmoya/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_nikolasmoya_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_nikolasmoya_pipeline_en.md new file mode 100644 index 00000000000000..2ad37cf4bc31ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_nikolasmoya_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_nikolasmoya_pipeline pipeline BertForSequenceClassification from nikolasmoya +author: John Snow Labs +name: phrasebank_sentiment_analysis_nikolasmoya_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_nikolasmoya_pipeline` is an English model originally trained by nikolasmoya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_nikolasmoya_pipeline_en_5.5.0_3.0_1727301851762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_nikolasmoya_pipeline_en_5.5.0_3.0_1727301851762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phrasebank_sentiment_analysis_nikolasmoya_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phrasebank_sentiment_analysis_nikolasmoya_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_nikolasmoya_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nikolasmoya/phrasebank-sentiment-analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_richychn_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_richychn_pipeline_en.md new file mode 100644 index 00000000000000..df2d7e745bd8ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_richychn_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_richychn_pipeline pipeline BertForSequenceClassification from richychn +author: John Snow Labs +name: phrasebank_sentiment_analysis_richychn_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_richychn_pipeline` is an English model originally trained by richychn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_richychn_pipeline_en_5.5.0_3.0_1727273113751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_richychn_pipeline_en_5.5.0_3.0_1727273113751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phrasebank_sentiment_analysis_richychn_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phrasebank_sentiment_analysis_richychn_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_richychn_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/richychn/phrasebank-sentiment-analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_saiteja_en.md b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_saiteja_en.md new file mode 100644 index 00000000000000..9a087ac8da4a88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_saiteja_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_saiteja BertForSequenceClassification from Saiteja +author: John Snow Labs +name: phrasebank_sentiment_analysis_saiteja +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_saiteja` is an English model originally trained by Saiteja. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_saiteja_en_5.5.0_3.0_1727268429357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_saiteja_en_5.5.0_3.0_1727268429357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_saiteja","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_saiteja", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_saiteja| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Saiteja/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_snrism_en.md b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_snrism_en.md new file mode 100644 index 00000000000000..4ef047c8dbfc71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-phrasebank_sentiment_analysis_snrism_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_snrism BertForSequenceClassification from snrism +author: John Snow Labs +name: phrasebank_sentiment_analysis_snrism +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_snrism` is an English model originally trained by snrism. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_snrism_en_5.5.0_3.0_1727301048302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_snrism_en_5.5.0_3.0_1727301048302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_snrism","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_snrism", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_snrism| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/snrism/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-pn6_800_en.md b/docs/_posts/ahmedlone127/2024-09-25-pn6_800_en.md new file mode 100644 index 00000000000000..a6965bb7dcbf84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-pn6_800_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English pn6_800 BertForSequenceClassification from abbassix +author: John Snow Labs +name: pn6_800 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pn6_800` is an English model originally trained by abbassix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pn6_800_en_5.5.0_3.0_1727290482384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pn6_800_en_5.5.0_3.0_1727290482384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("pn6_800","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pn6_800", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pn6_800| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/abbassix/pn6_800 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-propoint_final_project_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-propoint_final_project_pipeline_en.md new file mode 100644 index 00000000000000..7c5db9f0ab43ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-propoint_final_project_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English propoint_final_project_pipeline pipeline BertForSequenceClassification from DataAngelo +author: John Snow Labs +name: propoint_final_project_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `propoint_final_project_pipeline` is an English model originally trained by DataAngelo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/propoint_final_project_pipeline_en_5.5.0_3.0_1727272695646.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/propoint_final_project_pipeline_en_5.5.0_3.0_1727272695646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("propoint_final_project_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("propoint_final_project_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|propoint_final_project_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/DataAngelo/propoint_Final_project + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-psychbert_finetuned_mentalhealth_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-psychbert_finetuned_mentalhealth_pipeline_en.md new file mode 100644 index 00000000000000..400d99ffead8b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-psychbert_finetuned_mentalhealth_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English psychbert_finetuned_mentalhealth_pipeline pipeline BertForSequenceClassification from mnaylor +author: John Snow Labs +name: psychbert_finetuned_mentalhealth_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `psychbert_finetuned_mentalhealth_pipeline` is an English model originally trained by mnaylor. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/psychbert_finetuned_mentalhealth_pipeline_en_5.5.0_3.0_1727257471563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/psychbert_finetuned_mentalhealth_pipeline_en_5.5.0_3.0_1727257471563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("psychbert_finetuned_mentalhealth_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("psychbert_finetuned_mentalhealth_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|psychbert_finetuned_mentalhealth_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/mnaylor/psychbert-finetuned-mentalhealth + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-q05_kaggle_bertbasecased_nsc_en.md b/docs/_posts/ahmedlone127/2024-09-25-q05_kaggle_bertbasecased_nsc_en.md new file mode 100644 index 00000000000000..0d97fbf7fe1ff2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-q05_kaggle_bertbasecased_nsc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English q05_kaggle_bertbasecased_nsc BertForSequenceClassification from wallacenpj +author: John Snow Labs +name: q05_kaggle_bertbasecased_nsc +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `q05_kaggle_bertbasecased_nsc` is an English model originally trained by wallacenpj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/q05_kaggle_bertbasecased_nsc_en_5.5.0_3.0_1727288470225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/q05_kaggle_bertbasecased_nsc_en_5.5.0_3.0_1727288470225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("q05_kaggle_bertbasecased_nsc","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("q05_kaggle_bertbasecased_nsc", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|q05_kaggle_bertbasecased_nsc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/wallacenpj/q05_kaggle_bertbasecased_nsc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-q05_kaggle_bertbasecased_nsc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-q05_kaggle_bertbasecased_nsc_pipeline_en.md new file mode 100644 index 00000000000000..06d06798bb1563 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-q05_kaggle_bertbasecased_nsc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English q05_kaggle_bertbasecased_nsc_pipeline pipeline BertForSequenceClassification from wallacenpj +author: John Snow Labs +name: q05_kaggle_bertbasecased_nsc_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `q05_kaggle_bertbasecased_nsc_pipeline` is an English model originally trained by wallacenpj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/q05_kaggle_bertbasecased_nsc_pipeline_en_5.5.0_3.0_1727288493551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/q05_kaggle_bertbasecased_nsc_pipeline_en_5.5.0_3.0_1727288493551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("q05_kaggle_bertbasecased_nsc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("q05_kaggle_bertbasecased_nsc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|q05_kaggle_bertbasecased_nsc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/wallacenpj/q05_kaggle_bertbasecased_nsc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_en.md b/docs/_posts/ahmedlone127/2024-09-25-qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_en.md new file mode 100644 index 00000000000000..75a2454cd609d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6 BertForSequenceClassification from rawpowertools +author: John Snow Labs +name: qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6` is an English model originally trained by rawpowertools. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_en_5.5.0_3.0_1727295987475.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_en_5.5.0_3.0_1727295987475.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rawpowertools/qa-bert-classifier-test_QA_classifier_FULL_training_set_pass_threshold_objective8_subjective6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline_en.md new file mode 100644 index 00000000000000..74a0f927236674 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline pipeline BertForSequenceClassification from rawpowertools +author: John Snow Labs +name: qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline` is an English model originally trained by 
rawpowertools. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline_en_5.5.0_3.0_1727296012872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline_en_5.5.0_3.0_1727296012872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_bert_classifier_test_qa_classifier_full_training_set_pass_threshold_objective8_subjective6_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rawpowertools/qa-bert-classifier-test_QA_classifier_FULL_training_set_pass_threshold_objective8_subjective6 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-qd_dialog_bert_base_turkish_en.md b/docs/_posts/ahmedlone127/2024-09-25-qd_dialog_bert_base_turkish_en.md new file mode 100644 index 00000000000000..26fd93c2885c76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-qd_dialog_bert_base_turkish_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English qd_dialog_bert_base_turkish BertForSequenceClassification from Izzet +author: John Snow Labs +name: qd_dialog_bert_base_turkish +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qd_dialog_bert_base_turkish` is an English model originally trained by Izzet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qd_dialog_bert_base_turkish_en_5.5.0_3.0_1727291156296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qd_dialog_bert_base_turkish_en_5.5.0_3.0_1727291156296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("qd_dialog_bert_base_turkish","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("qd_dialog_bert_base_turkish", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qd_dialog_bert_base_turkish| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|691.2 MB| + +## References + +https://huggingface.co/Izzet/qd_dialog_bert-base-turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-qd_dialog_bert_base_turkish_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-qd_dialog_bert_base_turkish_pipeline_en.md new file mode 100644 index 00000000000000..d7b84a38d691d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-qd_dialog_bert_base_turkish_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English qd_dialog_bert_base_turkish_pipeline pipeline BertForSequenceClassification from Izzet +author: John Snow Labs +name: qd_dialog_bert_base_turkish_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qd_dialog_bert_base_turkish_pipeline` is an English model originally trained by Izzet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qd_dialog_bert_base_turkish_pipeline_en_5.5.0_3.0_1727291199507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qd_dialog_bert_base_turkish_pipeline_en_5.5.0_3.0_1727291199507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("qd_dialog_bert_base_turkish_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("qd_dialog_bert_base_turkish_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qd_dialog_bert_base_turkish_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|691.2 MB| + +## References + +https://huggingface.co/Izzet/qd_dialog_bert-base-turkish + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-quantum_neuraladaptivelearningsystem_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-quantum_neuraladaptivelearningsystem_pipeline_en.md new file mode 100644 index 00000000000000..9708d482b4724d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-quantum_neuraladaptivelearningsystem_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English quantum_neuraladaptivelearningsystem_pipeline pipeline BertForSequenceClassification from ayjays132 +author: John Snow Labs +name: quantum_neuraladaptivelearningsystem_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quantum_neuraladaptivelearningsystem_pipeline` is an English model originally trained by ayjays132.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quantum_neuraladaptivelearningsystem_pipeline_en_5.5.0_3.0_1727294042847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quantum_neuraladaptivelearningsystem_pipeline_en_5.5.0_3.0_1727294042847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("quantum_neuraladaptivelearningsystem_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("quantum_neuraladaptivelearningsystem_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quantum_neuraladaptivelearningsystem_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.1 MB| + +## References + +https://huggingface.co/ayjays132/Quantum-NeuralAdaptiveLearningSystem + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-regbert_2_en.md b/docs/_posts/ahmedlone127/2024-09-25-regbert_2_en.md new file mode 100644 index 00000000000000..e8c8469c35be6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-regbert_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English regbert_2 BertForSequenceClassification from Econlinguistics +author: John Snow Labs +name: regbert_2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `regbert_2` is an English model originally trained by Econlinguistics. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/regbert_2_en_5.5.0_3.0_1727306342121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/regbert_2_en_5.5.0_3.0_1727306342121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("regbert_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("regbert_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|regbert_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Econlinguistics/regbert_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-repml_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-repml_pipeline_en.md new file mode 100644 index 00000000000000..08db691c8649f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-repml_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English repml_pipeline pipeline BertForSequenceClassification from MiBo +author: John Snow Labs +name: repml_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `repml_pipeline` is an English model originally trained by MiBo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/repml_pipeline_en_5.5.0_3.0_1727304509442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/repml_pipeline_en_5.5.0_3.0_1727304509442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("repml_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("repml_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|repml_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.9 MB| + +## References + +https://huggingface.co/MiBo/RepML + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-repurchase_train3_en.md b/docs/_posts/ahmedlone127/2024-09-25-repurchase_train3_en.md new file mode 100644 index 00000000000000..8c81035347e072 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-repurchase_train3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English repurchase_train3 BertForSequenceClassification from laskovey +author: John Snow Labs +name: repurchase_train3 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `repurchase_train3` is an English model originally trained by laskovey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/repurchase_train3_en_5.5.0_3.0_1727288585593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/repurchase_train3_en_5.5.0_3.0_1727288585593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("repurchase_train3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("repurchase_train3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|repurchase_train3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/laskovey/repurchase_train3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-repurchase_train3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-repurchase_train3_pipeline_en.md new file mode 100644 index 00000000000000..75cd53b70d702c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-repurchase_train3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English repurchase_train3_pipeline pipeline BertForSequenceClassification from laskovey +author: John Snow Labs +name: repurchase_train3_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `repurchase_train3_pipeline` is an English model originally trained by laskovey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/repurchase_train3_pipeline_en_5.5.0_3.0_1727288591084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/repurchase_train3_pipeline_en_5.5.0_3.0_1727288591084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("repurchase_train3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("repurchase_train3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|repurchase_train3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/laskovey/repurchase_train3 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-roberta_cws_pku_en.md b/docs/_posts/ahmedlone127/2024-09-25-roberta_cws_pku_en.md new file mode 100644 index 00000000000000..1039af8e6efd89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-roberta_cws_pku_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_cws_pku BertForTokenClassification from tjspross +author: John Snow Labs +name: roberta_cws_pku +date: 2024-09-25 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_cws_pku` is an English model originally trained by tjspross. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_cws_pku_en_5.5.0_3.0_1727265309107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_cws_pku_en_5.5.0_3.0_1727265309107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = BertForTokenClassification.pretrained("roberta_cws_pku","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("roberta_cws_pku", "en") + .setInputCols(Array("document","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_cws_pku| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/tjspross/roberta_cws_pku \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny2_russe_toxicity_en.md b/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny2_russe_toxicity_en.md new file mode 100644 index 00000000000000..80b7cc6febae6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny2_russe_toxicity_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English rubert_tiny2_russe_toxicity BertForSequenceClassification from BunnyNoBugs +author: John Snow Labs +name: rubert_tiny2_russe_toxicity +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny2_russe_toxicity` is an English model originally trained by BunnyNoBugs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_russe_toxicity_en_5.5.0_3.0_1727278792861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_russe_toxicity_en_5.5.0_3.0_1727278792861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_russe_toxicity","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_russe_toxicity", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_russe_toxicity| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/BunnyNoBugs/rubert-tiny2-russe-toxicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny_custom_cross_encoder_en.md b/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny_custom_cross_encoder_en.md new file mode 100644 index 00000000000000..20a1f002363d89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny_custom_cross_encoder_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English rubert_tiny_custom_cross_encoder BertForSequenceClassification from WpythonW +author: John Snow Labs +name: rubert_tiny_custom_cross_encoder +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny_custom_cross_encoder` is an English model originally trained by WpythonW. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny_custom_cross_encoder_en_5.5.0_3.0_1727292506402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny_custom_cross_encoder_en_5.5.0_3.0_1727292506402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_custom_cross_encoder","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_custom_cross_encoder", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny_custom_cross_encoder| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/WpythonW/RUbert-tiny_custom_cross-encoder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny_custom_cross_encoder_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny_custom_cross_encoder_pipeline_en.md new file mode 100644 index 00000000000000..d98b51491299b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-rubert_tiny_custom_cross_encoder_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English rubert_tiny_custom_cross_encoder_pipeline pipeline BertForSequenceClassification from WpythonW +author: John Snow Labs +name: rubert_tiny_custom_cross_encoder_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny_custom_cross_encoder_pipeline` is an English model originally trained by WpythonW. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny_custom_cross_encoder_pipeline_en_5.5.0_3.0_1727292511843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny_custom_cross_encoder_pipeline_en_5.5.0_3.0_1727292511843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("rubert_tiny_custom_cross_encoder_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("rubert_tiny_custom_cross_encoder_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny_custom_cross_encoder_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/WpythonW/RUbert-tiny_custom_cross-encoder + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ruberttiny_multiclassv1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-ruberttiny_multiclassv1_pipeline_en.md new file mode 100644 index 00000000000000..04109f61d56590 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ruberttiny_multiclassv1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ruberttiny_multiclassv1_pipeline pipeline BertForSequenceClassification from Shakhovak +author: John Snow Labs +name: ruberttiny_multiclassv1_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ruberttiny_multiclassv1_pipeline` is an English model originally trained by Shakhovak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ruberttiny_multiclassv1_pipeline_en_5.5.0_3.0_1727261053098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ruberttiny_multiclassv1_pipeline_en_5.5.0_3.0_1727261053098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ruberttiny_multiclassv1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ruberttiny_multiclassv1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ruberttiny_multiclassv1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|109.6 MB| + +## References + +https://huggingface.co/Shakhovak/ruBertTiny_multiclassv1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ruberttox_pipeline_ru.md b/docs/_posts/ahmedlone127/2024-09-25-ruberttox_pipeline_ru.md new file mode 100644 index 00000000000000..951d60e88a6952 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ruberttox_pipeline_ru.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Russian ruberttox_pipeline pipeline BertForSequenceClassification from assskelad +author: John Snow Labs +name: ruberttox_pipeline +date: 2024-09-25 +tags: [ru, open_source, pipeline, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ruberttox_pipeline` is a Russian model originally trained by assskelad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ruberttox_pipeline_ru_5.5.0_3.0_1727292361043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ruberttox_pipeline_ru_5.5.0_3.0_1727292361043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ruberttox_pipeline", lang = "ru") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ruberttox_pipeline", lang = "ru") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ruberttox_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ru| +|Size:|669.3 MB| + +## References + +https://huggingface.co/assskelad/ruBerttox + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-ruberttox_ru.md b/docs/_posts/ahmedlone127/2024-09-25-ruberttox_ru.md new file mode 100644 index 00000000000000..37591d2728c6f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-ruberttox_ru.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Russian ruberttox BertForSequenceClassification from assskelad +author: John Snow Labs +name: ruberttox +date: 2024-09-25 +tags: [ru, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: ru +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ruberttox` is a Russian model originally trained by assskelad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ruberttox_ru_5.5.0_3.0_1727292322847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ruberttox_ru_5.5.0_3.0_1727292322847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ruberttox","ru") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ruberttox", "ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ruberttox| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|669.3 MB| + +## References + +https://huggingface.co/assskelad/ruBerttox \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-russscholar_seeker_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-russscholar_seeker_pipeline_en.md new file mode 100644 index 00000000000000..3248d1730d331f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-russscholar_seeker_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English russscholar_seeker_pipeline pipeline BertForSequenceClassification from Gao-Tianci +author: John Snow Labs +name: russscholar_seeker_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `russscholar_seeker_pipeline` is an English model originally trained by Gao-Tianci. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/russscholar_seeker_pipeline_en_5.5.0_3.0_1727263692392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/russscholar_seeker_pipeline_en_5.5.0_3.0_1727263692392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("russscholar_seeker_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("russscholar_seeker_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|russscholar_seeker_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Gao-Tianci/RussScholar-Seeker + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sabert_spanish_fake_news_es.md b/docs/_posts/ahmedlone127/2024-09-25-sabert_spanish_fake_news_es.md new file mode 100644 index 00000000000000..b629b70a6bbd9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sabert_spanish_fake_news_es.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Castilian, Spanish sabert_spanish_fake_news BertForSequenceClassification from VerificadoProfesional +author: John Snow Labs +name: sabert_spanish_fake_news +date: 2024-09-25 +tags: [es, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: es +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sabert_spanish_fake_news` is a Castilian, Spanish model originally trained by VerificadoProfesional. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sabert_spanish_fake_news_es_5.5.0_3.0_1727308799765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sabert_spanish_fake_news_es_5.5.0_3.0_1727308799765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sabert_spanish_fake_news","es") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sabert_spanish_fake_news", "es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sabert_spanish_fake_news| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.9 MB| + +## References + +https://huggingface.co/VerificadoProfesional/SaBERT-Spanish-Fake-News \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-samagrabot_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-25-samagrabot_classifier_en.md new file mode 100644 index 00000000000000..dc972a2c30c33c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-samagrabot_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English samagrabot_classifier BertForSequenceClassification from GautamR +author: John Snow Labs +name: samagrabot_classifier +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `samagrabot_classifier` is an English model originally trained by GautamR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/samagrabot_classifier_en_5.5.0_3.0_1727298437464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/samagrabot_classifier_en_5.5.0_3.0_1727298437464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("samagrabot_classifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("samagrabot_classifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|samagrabot_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/GautamR/samagrabot_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-samagrabot_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-samagrabot_classifier_pipeline_en.md new file mode 100644 index 00000000000000..1ac1c69e045bbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-samagrabot_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English samagrabot_classifier_pipeline pipeline BertForSequenceClassification from GautamR +author: John Snow Labs +name: samagrabot_classifier_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `samagrabot_classifier_pipeline` is an English model originally trained by GautamR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/samagrabot_classifier_pipeline_en_5.5.0_3.0_1727298459853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/samagrabot_classifier_pipeline_en_5.5.0_3.0_1727298459853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("samagrabot_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("samagrabot_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|samagrabot_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/GautamR/samagrabot_classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sbert_yelp2class_fast_en.md b/docs/_posts/ahmedlone127/2024-09-25-sbert_yelp2class_fast_en.md new file mode 100644 index 00000000000000..08775a0d0ef1f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sbert_yelp2class_fast_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sbert_yelp2class_fast BertForSequenceClassification from Siki-77 +author: John Snow Labs +name: sbert_yelp2class_fast +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sbert_yelp2class_fast` is an English model originally trained by Siki-77. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sbert_yelp2class_fast_en_5.5.0_3.0_1727308039270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sbert_yelp2class_fast_en_5.5.0_3.0_1727308039270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sbert_yelp2class_fast","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sbert_yelp2class_fast", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sbert_yelp2class_fast| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.7 MB| + +## References + +https://huggingface.co/Siki-77/sbert_yelp2class_fast \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sbert_yelp2class_fast_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sbert_yelp2class_fast_pipeline_en.md new file mode 100644 index 00000000000000..0ce455925a1589 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sbert_yelp2class_fast_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sbert_yelp2class_fast_pipeline pipeline BertForSequenceClassification from Siki-77 +author: John Snow Labs +name: sbert_yelp2class_fast_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sbert_yelp2class_fast_pipeline` is an English model originally trained by Siki-77. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sbert_yelp2class_fast_pipeline_en_5.5.0_3.0_1727308044609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sbert_yelp2class_fast_pipeline_en_5.5.0_3.0_1727308044609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sbert_yelp2class_fast_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sbert_yelp2class_fast_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sbert_yelp2class_fast_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|84.8 MB| + +## References + +https://huggingface.co/Siki-77/sbert_yelp2class_fast + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-scitweets_scibert_en.md b/docs/_posts/ahmedlone127/2024-09-25-scitweets_scibert_en.md new file mode 100644 index 00000000000000..4096ff23b27ced --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-scitweets_scibert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English scitweets_scibert BertForSequenceClassification from sschellhammer +author: John Snow Labs +name: scitweets_scibert +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scitweets_scibert` is an English model originally trained by sschellhammer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scitweets_scibert_en_5.5.0_3.0_1727294476130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scitweets_scibert_en_5.5.0_3.0_1727294476130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("scitweets_scibert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("scitweets_scibert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scitweets_scibert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/sschellhammer/SciTweets_SciBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-scitweets_scibert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-scitweets_scibert_pipeline_en.md new file mode 100644 index 00000000000000..8d0ddf76c5bf54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-scitweets_scibert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English scitweets_scibert_pipeline pipeline BertForSequenceClassification from sschellhammer +author: John Snow Labs +name: scitweets_scibert_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scitweets_scibert_pipeline` is an English model originally trained by sschellhammer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scitweets_scibert_pipeline_en_5.5.0_3.0_1727294497777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scitweets_scibert_pipeline_en_5.5.0_3.0_1727294497777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("scitweets_scibert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("scitweets_scibert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scitweets_scibert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/sschellhammer/SciTweets_SciBert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sead_l_6_h_256_a_8_mrpc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sead_l_6_h_256_a_8_mrpc_pipeline_en.md new file mode 100644 index 00000000000000..8f10dfca899ae7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sead_l_6_h_256_a_8_mrpc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sead_l_6_h_256_a_8_mrpc_pipeline pipeline BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_256_a_8_mrpc_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_256_a_8_mrpc_pipeline` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_mrpc_pipeline_en_5.5.0_3.0_1727256758599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_mrpc_pipeline_en_5.5.0_3.0_1727256758599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sead_l_6_h_256_a_8_mrpc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sead_l_6_h_256_a_8_mrpc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_256_a_8_mrpc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|47.3 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-256_A-8-mrpc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sead_l_6_h_384_a_12_rte_en.md b/docs/_posts/ahmedlone127/2024-09-25-sead_l_6_h_384_a_12_rte_en.md new file mode 100644 index 00000000000000..2edbc0f6f4e25e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sead_l_6_h_384_a_12_rte_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sead_l_6_h_384_a_12_rte BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_384_a_12_rte +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_384_a_12_rte` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_rte_en_5.5.0_3.0_1727304370305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_rte_en_5.5.0_3.0_1727304370305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_384_a_12_rte","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_384_a_12_rte", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_384_a_12_rte| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.2 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-384_A-12-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_french_italian_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_french_italian_cased_pipeline_en.md new file mode 100644 index 00000000000000..c2825d0b759969 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_french_italian_cased_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_base_english_french_italian_cased_pipeline pipeline BertSentenceEmbeddings from Geotrend +author: John Snow Labs +name: sent_bert_base_english_french_italian_cased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_base_english_french_italian_cased_pipeline` is an English model originally trained by Geotrend. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_french_italian_cased_pipeline_en_5.5.0_3.0_1727248919780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_french_italian_cased_pipeline_en_5.5.0_3.0_1727248919780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_base_english_french_italian_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_base_english_french_italian_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_english_french_italian_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|428.7 MB| + +## References + +https://huggingface.co/Geotrend/bert-base-en-fr-it-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_swahili_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_swahili_cased_pipeline_en.md new file mode 100644 index 00000000000000..2836114d2af2f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_swahili_cased_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_base_english_swahili_cased_pipeline pipeline BertSentenceEmbeddings from Geotrend +author: John Snow Labs +name: sent_bert_base_english_swahili_cased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_base_english_swahili_cased_pipeline` is an English model originally trained by Geotrend. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_swahili_cased_pipeline_en_5.5.0_3.0_1727252224766.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_swahili_cased_pipeline_en_5.5.0_3.0_1727252224766.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_base_english_swahili_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_base_english_swahili_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_english_swahili_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/Geotrend/bert-base-en-sw-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_urdu_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_urdu_cased_pipeline_en.md new file mode 100644 index 00000000000000..091d9cf67108c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-sent_bert_base_english_urdu_cased_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_base_english_urdu_cased_pipeline pipeline BertSentenceEmbeddings from Geotrend +author: John Snow Labs +name: sent_bert_base_english_urdu_cased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_bert_base_english_urdu_cased_pipeline` is an English model originally trained by Geotrend. 
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_urdu_cased_pipeline_en_5.5.0_3.0_1727256690394.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_english_urdu_cased_pipeline_en_5.5.0_3.0_1727256690394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("sent_bert_base_english_urdu_cased_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("sent_bert_base_english_urdu_cased_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sent_bert_base_english_urdu_cased_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|411.2 MB|
+
+## References
+
+https://huggingface.co/Geotrend/bert-base-en-ur-cased
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- SentenceDetectorDLModel
+- BertSentenceEmbeddings
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sent_medbit_pipeline_it.md b/docs/_posts/ahmedlone127/2024-09-25-sent_medbit_pipeline_it.md
new file mode 100644
index 00000000000000..b91291f3b1fef9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sent_medbit_pipeline_it.md
@@ -0,0 +1,71 @@
+---
+layout: model
+title: Italian sent_medbit_pipeline pipeline BertSentenceEmbeddings from IVN-RIN
+author: John Snow Labs
+name: sent_medbit_pipeline
+date: 2024-09-25
+tags: [it, open_source, pipeline, onnx]
+task: Embeddings
+language: it
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_medbit_pipeline` is an Italian model originally trained by IVN-RIN.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_medbit_pipeline_it_5.5.0_3.0_1727248706567.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_medbit_pipeline_it_5.5.0_3.0_1727248706567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("sent_medbit_pipeline", lang = "it")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("sent_medbit_pipeline", lang = "it")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sent_medbit_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|it|
+|Size:|409.7 MB|
+
+## References
+
+https://huggingface.co/IVN-RIN/medBIT
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- SentenceDetectorDLModel
+- BertSentenceEmbeddings
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sent_miem_scibert_linguistic_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sent_miem_scibert_linguistic_pipeline_en.md
new file mode 100644
index 00000000000000..0b9e1bb4839998
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sent_miem_scibert_linguistic_pipeline_en.md
@@ -0,0 +1,71 @@
+---
+layout: model
+title: English sent_miem_scibert_linguistic_pipeline pipeline BertSentenceEmbeddings from miemBertProject
+author: John Snow Labs
+name: sent_miem_scibert_linguistic_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Embeddings
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_miem_scibert_linguistic_pipeline` is an English model originally trained by miemBertProject.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_miem_scibert_linguistic_pipeline_en_5.5.0_3.0_1727249291091.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_miem_scibert_linguistic_pipeline_en_5.5.0_3.0_1727249291091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("sent_miem_scibert_linguistic_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("sent_miem_scibert_linguistic_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sent_miem_scibert_linguistic_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|658.0 MB|
+
+## References
+
+https://huggingface.co/miemBertProject/miem-scibert-linguistic
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- SentenceDetectorDLModel
+- BertSentenceEmbeddings
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_bert_based_model_en.md b/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_bert_based_model_en.md
new file mode 100644
index 00000000000000..1b8889b4fa6d5f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_bert_based_model_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English sentiment_analysis_bert_based_model BertForSequenceClassification from GhylB
+author: John Snow Labs
+name: sentiment_analysis_bert_based_model
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_bert_based_model` is an English model originally trained by GhylB.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_bert_based_model_en_5.5.0_3.0_1727299614740.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_bert_based_model_en_5.5.0_3.0_1727299614740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_bert_based_model","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_bert_based_model", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_analysis_bert_based_model|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/GhylB/Sentiment_Analysis_BERT_Based_MODEL
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_bert_based_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_bert_based_model_pipeline_en.md
new file mode 100644
index 00000000000000..e69172513cc127
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_bert_based_model_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English sentiment_analysis_bert_based_model_pipeline pipeline BertForSequenceClassification from GhylB
+author: John Snow Labs
+name: sentiment_analysis_bert_based_model_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_bert_based_model_pipeline` is an English model originally trained by GhylB.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_bert_based_model_pipeline_en_5.5.0_3.0_1727299637133.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_bert_based_model_pipeline_en_5.5.0_3.0_1727299637133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("sentiment_analysis_bert_based_model_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("sentiment_analysis_bert_based_model_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_analysis_bert_based_model_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/GhylB/Sentiment_Analysis_BERT_Based_MODEL
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_trainer_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_trainer_model_pipeline_en.md
new file mode 100644
index 00000000000000..3312503f7418e8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sentiment_analysis_trainer_model_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English sentiment_analysis_trainer_model_pipeline pipeline BertForSequenceClassification from benmanks
+author: John Snow Labs
+name: sentiment_analysis_trainer_model_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_trainer_model_pipeline` is an English model originally trained by benmanks.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_trainer_model_pipeline_en_5.5.0_3.0_1727306370734.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_trainer_model_pipeline_en_5.5.0_3.0_1727306370734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("sentiment_analysis_trainer_model_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("sentiment_analysis_trainer_model_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_analysis_trainer_model_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|466.4 MB|
+
+## References
+
+https://huggingface.co/benmanks/sentiment_analysis_trainer_model
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sentiment_classfication_distilbert_model_en.md b/docs/_posts/ahmedlone127/2024-09-25-sentiment_classfication_distilbert_model_en.md
new file mode 100644
index 00000000000000..d6bc9996647a88
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sentiment_classfication_distilbert_model_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English sentiment_classfication_distilbert_model BertForSequenceClassification from aaronayitey
+author: John Snow Labs
+name: sentiment_classfication_distilbert_model
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_classfication_distilbert_model` is an English model originally trained by aaronayitey.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_classfication_distilbert_model_en_5.5.0_3.0_1727278258715.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_classfication_distilbert_model_en_5.5.0_3.0_1727278258715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_classfication_distilbert_model","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_classfication_distilbert_model", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_classfication_distilbert_model|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/aaronayitey/Sentiment-classfication-distilBERT-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sentiment_classfication_distilbert_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-sentiment_classfication_distilbert_model_pipeline_en.md
new file mode 100644
index 00000000000000..c5a4b18a0b7957
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sentiment_classfication_distilbert_model_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English sentiment_classfication_distilbert_model_pipeline pipeline BertForSequenceClassification from aaronayitey
+author: John Snow Labs
+name: sentiment_classfication_distilbert_model_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_classfication_distilbert_model_pipeline` is an English model originally trained by aaronayitey.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_classfication_distilbert_model_pipeline_en_5.5.0_3.0_1727278279567.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_classfication_distilbert_model_pipeline_en_5.5.0_3.0_1727278279567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("sentiment_classfication_distilbert_model_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("sentiment_classfication_distilbert_model_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_classfication_distilbert_model_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/aaronayitey/Sentiment-classfication-distilBERT-model
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-snli_w_premise_100k_1_epoch_en.md b/docs/_posts/ahmedlone127/2024-09-25-snli_w_premise_100k_1_epoch_en.md
new file mode 100644
index 00000000000000..3f7a5d40a25759
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-snli_w_premise_100k_1_epoch_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English snli_w_premise_100k_1_epoch BertForSequenceClassification from grace-pro
+author: John Snow Labs
+name: snli_w_premise_100k_1_epoch
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `snli_w_premise_100k_1_epoch` is an English model originally trained by grace-pro.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/snli_w_premise_100k_1_epoch_en_5.5.0_3.0_1727284544873.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/snli_w_premise_100k_1_epoch_en_5.5.0_3.0_1727284544873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("snli_w_premise_100k_1_epoch","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("snli_w_premise_100k_1_epoch", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|snli_w_premise_100k_1_epoch|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/grace-pro/snli_w_premise_100k_1_epoch
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-std_pnt_04_feather_berts_12_en.md b/docs/_posts/ahmedlone127/2024-09-25-std_pnt_04_feather_berts_12_en.md
new file mode 100644
index 00000000000000..c429300874441c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-std_pnt_04_feather_berts_12_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English std_pnt_04_feather_berts_12 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: std_pnt_04_feather_berts_12
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `std_pnt_04_feather_berts_12` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/std_pnt_04_feather_berts_12_en_5.5.0_3.0_1727289543759.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/std_pnt_04_feather_berts_12_en_5.5.0_3.0_1727289543759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("std_pnt_04_feather_berts_12","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("std_pnt_04_feather_berts_12", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|std_pnt_04_feather_berts_12|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/std_pnt_04_feather_berts-12
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-std_pnt_04_feather_berts_12_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-std_pnt_04_feather_berts_12_pipeline_en.md
new file mode 100644
index 00000000000000..6ecacf1edff5d6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-std_pnt_04_feather_berts_12_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English std_pnt_04_feather_berts_12_pipeline pipeline BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: std_pnt_04_feather_berts_12_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `std_pnt_04_feather_berts_12_pipeline` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/std_pnt_04_feather_berts_12_pipeline_en_5.5.0_3.0_1727289565885.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/std_pnt_04_feather_berts_12_pipeline_en_5.5.0_3.0_1727289565885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("std_pnt_04_feather_berts_12_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("std_pnt_04_feather_berts_12_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|std_pnt_04_feather_berts_12_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/std_pnt_04_feather_berts-12
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-sumto_fns2020_en.md b/docs/_posts/ahmedlone127/2024-09-25-sumto_fns2020_en.md
new file mode 100644
index 00000000000000..7eaa7f963c9dd1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-sumto_fns2020_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English sumto_fns2020 BertForSequenceClassification from morenolq
+author: John Snow Labs
+name: sumto_fns2020
+date: 2024-09-25
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sumto_fns2020` is an English model originally trained by morenolq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sumto_fns2020_en_5.5.0_3.0_1727307225319.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sumto_fns2020_en_5.5.0_3.0_1727307225319.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("sumto_fns2020","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("sumto_fns2020", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sumto_fns2020|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.8 MB|
+
+## References
+
+https://huggingface.co/morenolq/SumTO_FNS2020
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-25-t_frex_bert_large_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-t_frex_bert_large_uncased_pipeline_en.md
new file mode 100644
index 00000000000000..2128e9fbd13a8a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-25-t_frex_bert_large_uncased_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English t_frex_bert_large_uncased_pipeline pipeline BertForTokenClassification from quim-motger
+author: John Snow Labs
+name: t_frex_bert_large_uncased_pipeline
+date: 2024-09-25
+tags: [en, open_source, pipeline, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `t_frex_bert_large_uncased_pipeline` is an English model originally trained by quim-motger.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t_frex_bert_large_uncased_pipeline_en_5.5.0_3.0_1727271831578.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t_frex_bert_large_uncased_pipeline_en_5.5.0_3.0_1727271831578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t_frex_bert_large_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t_frex_bert_large_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t_frex_bert_large_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/quim-motger/t-frex-bert-large-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-test_ner_rundi_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-test_ner_rundi_pipeline_en.md new file mode 100644 index 00000000000000..bf0c6febc6db08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-test_ner_rundi_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English test_ner_rundi_pipeline pipeline BertForTokenClassification from lltala +author: John Snow Labs +name: test_ner_rundi_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_ner_rundi_pipeline` is an English model originally trained by lltala. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_ner_rundi_pipeline_en_5.5.0_3.0_1727283646883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_ner_rundi_pipeline_en_5.5.0_3.0_1727283646883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("test_ner_rundi_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("test_ner_rundi_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_ner_rundi_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/lltala/test-ner-run + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-test_trainer_alerams_en.md b/docs/_posts/ahmedlone127/2024-09-25-test_trainer_alerams_en.md new file mode 100644 index 00000000000000..0728a06f9747d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-test_trainer_alerams_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English test_trainer_alerams BertForSequenceClassification from AleRams +author: John Snow Labs +name: test_trainer_alerams +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_alerams` is an English model originally trained by AleRams. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_alerams_en_5.5.0_3.0_1727289711818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_alerams_en_5.5.0_3.0_1727289711818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_alerams","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_alerams", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_alerams| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AleRams/test-trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-test_trainer_lsb_en.md b/docs/_posts/ahmedlone127/2024-09-25-test_trainer_lsb_en.md new file mode 100644 index 00000000000000..408290ca38096e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-test_trainer_lsb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English test_trainer_lsb BertForSequenceClassification from lsb +author: John Snow Labs +name: test_trainer_lsb +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_lsb` is an English model originally trained by lsb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_lsb_en_5.5.0_3.0_1727291626463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_lsb_en_5.5.0_3.0_1727291626463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_lsb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_lsb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_lsb| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/lsb/test_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-test_trainer_lsb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-test_trainer_lsb_pipeline_en.md new file mode 100644 index 00000000000000..68bbdfefecb2da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-test_trainer_lsb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English test_trainer_lsb_pipeline pipeline BertForSequenceClassification from lsb +author: John Snow Labs +name: test_trainer_lsb_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_lsb_pipeline` is an English model originally trained by lsb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_lsb_pipeline_en_5.5.0_3.0_1727291659578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_lsb_pipeline_en_5.5.0_3.0_1727291659578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("test_trainer_lsb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("test_trainer_lsb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_lsb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/lsb/test_trainer + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-testing1_en.md b/docs/_posts/ahmedlone127/2024-09-25-testing1_en.md new file mode 100644 index 00000000000000..01d036c14c0376 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-testing1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English testing1 BertForSequenceClassification from SpicyCorpse +author: John Snow Labs +name: testing1 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testing1` is an English model originally trained by SpicyCorpse. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testing1_en_5.5.0_3.0_1727308278240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testing1_en_5.5.0_3.0_1727308278240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("testing1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("testing1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testing1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/SpicyCorpse/testing1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-testing1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-testing1_pipeline_en.md new file mode 100644 index 00000000000000..fd19079f4d50d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-testing1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English testing1_pipeline pipeline BertForSequenceClassification from SpicyCorpse +author: John Snow Labs +name: testing1_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testing1_pipeline` is an English model originally trained by SpicyCorpse. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testing1_pipeline_en_5.5.0_3.0_1727308299627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testing1_pipeline_en_5.5.0_3.0_1727308299627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("testing1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("testing1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testing1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/SpicyCorpse/testing1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-text_classification_medical_en.md b/docs/_posts/ahmedlone127/2024-09-25-text_classification_medical_en.md new file mode 100644 index 00000000000000..329f0c4c36c931 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-text_classification_medical_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English text_classification_medical BertForSequenceClassification from felipe-nextly +author: John Snow Labs +name: text_classification_medical +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classification_medical` is an English model originally trained by felipe-nextly. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classification_medical_en_5.5.0_3.0_1727295686217.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classification_medical_en_5.5.0_3.0_1727295686217.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("text_classification_medical","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("text_classification_medical", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classification_medical| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/felipe-nextly/text-classification-medical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sentiment_persian_fa.md b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sentiment_persian_fa.md new file mode 100644 index 00000000000000..7a62c0d0137bd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sentiment_persian_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian tiny_bert_sentiment_persian BertForSequenceClassification from dadashzadeh +author: John Snow Labs +name: tiny_bert_sentiment_persian +date: 2024-09-25 +tags: [fa, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_sentiment_persian` is a Persian model originally trained by dadashzadeh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_sentiment_persian_fa_5.5.0_3.0_1727292676299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_sentiment_persian_fa_5.5.0_3.0_1727292676299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_sentiment_persian","fa") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_sentiment_persian", "fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_sentiment_persian| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|16.7 MB| + +## References + +https://huggingface.co/dadashzadeh/tiny-bert-Sentiment-persian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sentiment_persian_pipeline_fa.md b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sentiment_persian_pipeline_fa.md new file mode 100644 index 00000000000000..467e240e320775 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sentiment_persian_pipeline_fa.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Persian tiny_bert_sentiment_persian_pipeline pipeline BertForSequenceClassification from dadashzadeh +author: John Snow Labs +name: tiny_bert_sentiment_persian_pipeline +date: 2024-09-25 +tags: [fa, open_source, pipeline, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_sentiment_persian_pipeline` is a Persian model originally trained by dadashzadeh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_sentiment_persian_pipeline_fa_5.5.0_3.0_1727292677577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_sentiment_persian_pipeline_fa_5.5.0_3.0_1727292677577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tiny_bert_sentiment_persian_pipeline", lang = "fa") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tiny_bert_sentiment_persian_pipeline", lang = "fa") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_sentiment_persian_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|fa| +|Size:|16.7 MB| + +## References + +https://huggingface.co/dadashzadeh/tiny-bert-Sentiment-persian + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sst2_distilled_clone_en.md b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sst2_distilled_clone_en.md new file mode 100644 index 00000000000000..7dc68c682dfa8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sst2_distilled_clone_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tiny_bert_sst2_distilled_clone BertForSequenceClassification from fxmarty +author: John Snow Labs +name: tiny_bert_sst2_distilled_clone +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_sst2_distilled_clone` is an English model originally trained by fxmarty. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_sst2_distilled_clone_en_5.5.0_3.0_1727273259585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_sst2_distilled_clone_en_5.5.0_3.0_1727273259585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_sst2_distilled_clone","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_sst2_distilled_clone", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_sst2_distilled_clone| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/fxmarty/tiny-bert-sst2-distilled-clone \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sst2_distilled_clone_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sst2_distilled_clone_pipeline_en.md new file mode 100644 index 00000000000000..09fd97868e6b98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-tiny_bert_sst2_distilled_clone_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tiny_bert_sst2_distilled_clone_pipeline pipeline BertForSequenceClassification from fxmarty +author: John Snow Labs +name: tiny_bert_sst2_distilled_clone_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_sst2_distilled_clone_pipeline` is an English model originally trained by fxmarty. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_sst2_distilled_clone_pipeline_en_5.5.0_3.0_1727273260802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_sst2_distilled_clone_pipeline_en_5.5.0_3.0_1727273260802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tiny_bert_sst2_distilled_clone_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tiny_bert_sst2_distilled_clone_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_sst2_distilled_clone_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/fxmarty/tiny-bert-sst2-distilled-clone + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_en.md b/docs/_posts/ahmedlone127/2024-09-25-toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_en.md new file mode 100644 index 00000000000000..a9d598246db9cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased BertForSequenceClassification from l2reg +author: John Snow Labs +name: toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased` is an English model originally trained by l2reg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_en_5.5.0_3.0_1727304380452.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_en_5.5.0_3.0_1727304380452.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|691.6 MB| + +## References + +https://huggingface.co/l2reg/toxic-dbmdz-bert-base-turkish-128k-uncased-fully-unbiased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline_en.md new file mode 100644 index 00000000000000..72e13e42162832 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline pipeline BertForSequenceClassification from l2reg +author: John Snow Labs +name: toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline` is an English model originally trained by l2reg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline_en_5.5.0_3.0_1727304416836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline_en_5.5.0_3.0_1727304416836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_dbmdz_bert_base_turkish_128k_uncased_fully_unbiased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|691.6 MB| + +## References + +https://huggingface.co/l2reg/toxic-dbmdz-bert-base-turkish-128k-uncased-fully-unbiased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-toxic_hubert_pipeline_hu.md b/docs/_posts/ahmedlone127/2024-09-25-toxic_hubert_pipeline_hu.md new file mode 100644 index 00000000000000..fe456d0aebeb52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-toxic_hubert_pipeline_hu.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Hungarian toxic_hubert_pipeline pipeline BertForSequenceClassification from RabidUmarell +author: John Snow Labs +name: toxic_hubert_pipeline +date: 2024-09-25 +tags: [hu, open_source, pipeline, onnx] +task: Text Classification +language: hu +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`toxic_hubert_pipeline` is a Hungarian model originally trained by RabidUmarell. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_hubert_pipeline_hu_5.5.0_3.0_1727300490850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_hubert_pipeline_hu_5.5.0_3.0_1727300490850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("toxic_hubert_pipeline", lang = "hu") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("toxic_hubert_pipeline", lang = "hu") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_hubert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|hu| +|Size:|414.7 MB| + +## References + +https://huggingface.co/RabidUmarell/toxic-hubert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-toxicity_analyzer_en.md b/docs/_posts/ahmedlone127/2024-09-25-toxicity_analyzer_en.md new file mode 100644 index 00000000000000..4f3a84d00f6266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-toxicity_analyzer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English toxicity_analyzer BertForSequenceClassification from Vlad1m +author: John Snow Labs +name: toxicity_analyzer +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxicity_analyzer` is an English model originally trained by Vlad1m. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicity_analyzer_en_5.5.0_3.0_1727300948025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicity_analyzer_en_5.5.0_3.0_1727300948025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("toxicity_analyzer","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toxicity_analyzer", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxicity_analyzer| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|664.4 MB| + +## References + +https://huggingface.co/Vlad1m/toxicity_analyzer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-toxicity_analyzer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-toxicity_analyzer_pipeline_en.md new file mode 100644 index 00000000000000..3a5d331bcb8be1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-toxicity_analyzer_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English toxicity_analyzer_pipeline pipeline BertForSequenceClassification from Vlad1m +author: John Snow Labs +name: toxicity_analyzer_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxicity_analyzer_pipeline` is an English model originally trained by Vlad1m. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicity_analyzer_pipeline_en_5.5.0_3.0_1727300984723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicity_analyzer_pipeline_en_5.5.0_3.0_1727300984723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("toxicity_analyzer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("toxicity_analyzer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxicity_analyzer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|664.5 MB| + +## References + +https://huggingface.co/Vlad1m/toxicity_analyzer + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-trac2020_all_a_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2024-09-25-trac2020_all_a_bert_base_multilingual_uncased_xx.md new file mode 100644 index 00000000000000..87904188c07702 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-trac2020_all_a_bert_base_multilingual_uncased_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual trac2020_all_a_bert_base_multilingual_uncased BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_all_a_bert_base_multilingual_uncased +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`trac2020_all_a_bert_base_multilingual_uncased` is a Multilingual model originally trained by socialmediaie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_all_a_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727305662508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_all_a_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727305662508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_all_a_bert_base_multilingual_uncased","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_all_a_bert_base_multilingual_uncased", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_all_a_bert_base_multilingual_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_ALL_A_bert-base-multilingual-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-trac2020_hin_a_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2024-09-25-trac2020_hin_a_bert_base_multilingual_uncased_xx.md new file mode 100644 index 00000000000000..f74f3028fbe24b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-trac2020_hin_a_bert_base_multilingual_uncased_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual trac2020_hin_a_bert_base_multilingual_uncased BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_hin_a_bert_base_multilingual_uncased +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`trac2020_hin_a_bert_base_multilingual_uncased` is a Multilingual model originally trained by socialmediaie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_hin_a_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727305052640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_hin_a_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727305052640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_hin_a_bert_base_multilingual_uncased","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_hin_a_bert_base_multilingual_uncased", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_hin_a_bert_base_multilingual_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_HIN_A_bert-base-multilingual-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-09-25-turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline_tr.md new file mode 100644 index 00000000000000..c89437bfe5c460 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline_tr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Turkish turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline pipeline BertForSequenceClassification from atasoglu +author: John Snow Labs +name: turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline +date: 2024-09-25 +tags: [tr, open_source, pipeline, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline` is a Turkish model originally trained by atasoglu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline_tr_5.5.0_3.0_1727287836537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline_tr_5.5.0_3.0_1727287836537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_tiny_bert_uncased_offenseval2020_turkish_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|17.5 MB| + +## References + +https://huggingface.co/atasoglu/turkish-tiny-bert-uncased-offenseval2020_tr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-twitter_bert_base_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-twitter_bert_base_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..2d7d75ed242b4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-twitter_bert_base_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English twitter_bert_base_sentiment_pipeline pipeline BertForSequenceClassification from jonathanybema +author: John Snow Labs +name: twitter_bert_base_sentiment_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_bert_base_sentiment_pipeline` is an English model originally trained by jonathanybema. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_bert_base_sentiment_pipeline_en_5.5.0_3.0_1727284674397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_bert_base_sentiment_pipeline_en_5.5.0_3.0_1727284674397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("twitter_bert_base_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("twitter_bert_base_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_bert_base_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonathanybema/twitter-bert-base-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-25-twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline_xx.md new file mode 100644 index 00000000000000..df4edfaa5d38ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline pipeline BertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline +date: 2024-09-25 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline` is a Multilingual model originally trained by SiddharthaM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline_xx_5.5.0_3.0_1727289343244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline_xx_5.5.0_3.0_1727289343244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_data_bert_base_multilingual_uncased_hindi_only_memes_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/SiddharthaM/twitter-data-bert-base-multilingual-uncased-hindi-only-memes + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-twitter_data_bert_base_multilingual_uncased_hindi_only_memes_xx.md b/docs/_posts/ahmedlone127/2024-09-25-twitter_data_bert_base_multilingual_uncased_hindi_only_memes_xx.md new file mode 100644 index 00000000000000..333540d41b44bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-twitter_data_bert_base_multilingual_uncased_hindi_only_memes_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual twitter_data_bert_base_multilingual_uncased_hindi_only_memes BertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: twitter_data_bert_base_multilingual_uncased_hindi_only_memes +date: 2024-09-25 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`twitter_data_bert_base_multilingual_uncased_hindi_only_memes` is a Multilingual model originally trained by SiddharthaM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_data_bert_base_multilingual_uncased_hindi_only_memes_xx_5.5.0_3.0_1727289304564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_data_bert_base_multilingual_uncased_hindi_only_memes_xx_5.5.0_3.0_1727289304564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("twitter_data_bert_base_multilingual_uncased_hindi_only_memes","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("twitter_data_bert_base_multilingual_uncased_hindi_only_memes", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_data_bert_base_multilingual_uncased_hindi_only_memes| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/SiddharthaM/twitter-data-bert-base-multilingual-uncased-hindi-only-memes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-vaccinchatsentenceclassifierdutch_frombertje2_dadialog_en.md b/docs/_posts/ahmedlone127/2024-09-25-vaccinchatsentenceclassifierdutch_frombertje2_dadialog_en.md new file mode 100644 index 00000000000000..b22a24ac0d3635 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-vaccinchatsentenceclassifierdutch_frombertje2_dadialog_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English vaccinchatsentenceclassifierdutch_frombertje2_dadialog BertForSequenceClassification from Jeska +author: John Snow Labs +name: vaccinchatsentenceclassifierdutch_frombertje2_dadialog +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vaccinchatsentenceclassifierdutch_frombertje2_dadialog` is an English model originally trained by Jeska. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vaccinchatsentenceclassifierdutch_frombertje2_dadialog_en_5.5.0_3.0_1727268336220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vaccinchatsentenceclassifierdutch_frombertje2_dadialog_en_5.5.0_3.0_1727268336220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("vaccinchatsentenceclassifierdutch_frombertje2_dadialog","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("vaccinchatsentenceclassifierdutch_frombertje2_dadialog", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vaccinchatsentenceclassifierdutch_frombertje2_dadialog| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Jeska/VaccinChatSentenceClassifierDutch_fromBERTje2_DAdialog \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-vinilm_2021_qa_evaluator_en.md b/docs/_posts/ahmedlone127/2024-09-25-vinilm_2021_qa_evaluator_en.md new file mode 100644 index 00000000000000..12ebb62d7e063c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-vinilm_2021_qa_evaluator_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English vinilm_2021_qa_evaluator BertForSequenceClassification from VMware +author: John Snow Labs +name: vinilm_2021_qa_evaluator +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vinilm_2021_qa_evaluator` is an English model originally trained by VMware. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vinilm_2021_qa_evaluator_en_5.5.0_3.0_1727290732614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vinilm_2021_qa_evaluator_en_5.5.0_3.0_1727290732614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("vinilm_2021_qa_evaluator","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("vinilm_2021_qa_evaluator", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vinilm_2021_qa_evaluator| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.5 MB| + +## References + +https://huggingface.co/VMware/vinilm-2021-qa-evaluator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-wuwenbin_test_bert_base_uncased_mrpc_en.md b/docs/_posts/ahmedlone127/2024-09-25-wuwenbin_test_bert_base_uncased_mrpc_en.md new file mode 100644 index 00000000000000..fb7a38d8811afe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-wuwenbin_test_bert_base_uncased_mrpc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English wuwenbin_test_bert_base_uncased_mrpc BertForSequenceClassification from RainboWu +author: John Snow Labs +name: wuwenbin_test_bert_base_uncased_mrpc +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wuwenbin_test_bert_base_uncased_mrpc` is an English model originally trained by RainboWu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wuwenbin_test_bert_base_uncased_mrpc_en_5.5.0_3.0_1727278763327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wuwenbin_test_bert_base_uncased_mrpc_en_5.5.0_3.0_1727278763327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("wuwenbin_test_bert_base_uncased_mrpc","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("wuwenbin_test_bert_base_uncased_mrpc", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wuwenbin_test_bert_base_uncased_mrpc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/RainboWu/wuwenbin-test-bert-base-uncased-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-wuwenbin_test_bert_base_uncased_mrpc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-wuwenbin_test_bert_base_uncased_mrpc_pipeline_en.md new file mode 100644 index 00000000000000..9dc12167d5a3db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-wuwenbin_test_bert_base_uncased_mrpc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English wuwenbin_test_bert_base_uncased_mrpc_pipeline pipeline BertForSequenceClassification from RainboWu +author: John Snow Labs +name: wuwenbin_test_bert_base_uncased_mrpc_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wuwenbin_test_bert_base_uncased_mrpc_pipeline` is an English model originally trained by RainboWu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wuwenbin_test_bert_base_uncased_mrpc_pipeline_en_5.5.0_3.0_1727278786445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wuwenbin_test_bert_base_uncased_mrpc_pipeline_en_5.5.0_3.0_1727278786445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("wuwenbin_test_bert_base_uncased_mrpc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("wuwenbin_test_bert_base_uncased_mrpc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wuwenbin_test_bert_base_uncased_mrpc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/RainboWu/wuwenbin-test-bert-base-uncased-mrpc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-xlm_roberta_base_finetuned_misogyny_sexism_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-25-xlm_roberta_base_finetuned_misogyny_sexism_pipeline_en.md new file mode 100644 index 00000000000000..d28722aa2e8055 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-xlm_roberta_base_finetuned_misogyny_sexism_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_misogyny_sexism_pipeline pipeline XlmRoBertaForSequenceClassification from annahaz +author: John Snow Labs +name: xlm_roberta_base_finetuned_misogyny_sexism_pipeline +date: 2024-09-25 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_finetuned_misogyny_sexism_pipeline` is an English model originally trained by annahaz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_misogyny_sexism_pipeline_en_5.5.0_3.0_1727229240510.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_misogyny_sexism_pipeline_en_5.5.0_3.0_1727229240510.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_misogyny_sexism_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_misogyny_sexism_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_misogyny_sexism_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|877.4 MB| + +## References + +https://huggingface.co/annahaz/xlm-roberta-base-finetuned-misogyny-sexism + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-yahoo2_en.md b/docs/_posts/ahmedlone127/2024-09-25-yahoo2_en.md new file mode 100644 index 00000000000000..014352a5c965f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-yahoo2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English yahoo2 BertForSequenceClassification from Lumos +author: John Snow Labs +name: yahoo2 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yahoo2` is an English model originally trained by Lumos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yahoo2_en_5.5.0_3.0_1727287204052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yahoo2_en_5.5.0_3.0_1727287204052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("yahoo2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("yahoo2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yahoo2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Lumos/yahoo2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-25-zidan_model_output_v4_en.md b/docs/_posts/ahmedlone127/2024-09-25-zidan_model_output_v4_en.md new file mode 100644 index 00000000000000..5f38fad2008c2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-25-zidan_model_output_v4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English zidan_model_output_v4 BertForSequenceClassification from ZidanAf +author: John Snow Labs +name: zidan_model_output_v4 +date: 2024-09-25 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `zidan_model_output_v4` is an English model originally trained by ZidanAf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/zidan_model_output_v4_en_5.5.0_3.0_1727268472265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/zidan_model_output_v4_en_5.5.0_3.0_1727268472265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("zidan_model_output_v4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("zidan_model_output_v4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|zidan_model_output_v4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|413.9 MB| + +## References + +https://huggingface.co/ZidanAf/Zidan_model_output_v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-11_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-11_pipeline_en.md new file mode 100644 index 00000000000000..94b15c75fa4057 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-11_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English 11_pipeline pipeline BertForSequenceClassification from hyeonddu +author: John Snow Labs +name: 11_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `11_pipeline` is an English model originally trained by hyeonddu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/11_pipeline_en_5.5.0_3.0_1727338613066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/11_pipeline_en_5.5.0_3.0_1727338613066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("11_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("11_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|11_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/hyeonddu/11 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-2d6_1600_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-2d6_1600_pipeline_en.md new file mode 100644 index 00000000000000..c0124a59dcb414 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-2d6_1600_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English 2d6_1600_pipeline pipeline BertForSequenceClassification from abbassix +author: John Snow Labs +name: 2d6_1600_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `2d6_1600_pipeline` is an English model originally trained by abbassix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/2d6_1600_pipeline_en_5.5.0_3.0_1727341302416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/2d6_1600_pipeline_en_5.5.0_3.0_1727341302416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("2d6_1600_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("2d6_1600_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|2d6_1600_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/abbassix/2d6_1600 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-2d_oomv1_800_en.md b/docs/_posts/ahmedlone127/2024-09-26-2d_oomv1_800_en.md new file mode 100644 index 00000000000000..1580ba7e9c5c60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-2d_oomv1_800_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English 2d_oomv1_800 BertForSequenceClassification from abbassix +author: John Snow Labs +name: 2d_oomv1_800 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `2d_oomv1_800` is an English model originally trained by abbassix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/2d_oomv1_800_en_5.5.0_3.0_1727351412578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/2d_oomv1_800_en_5.5.0_3.0_1727351412578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("2d_oomv1_800","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("2d_oomv1_800", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|2d_oomv1_800| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/abbassix/2d_oomv1_800 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-2d_oomv1_800_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-2d_oomv1_800_pipeline_en.md new file mode 100644 index 00000000000000..1e08efd167e2c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-2d_oomv1_800_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English 2d_oomv1_800_pipeline pipeline BertForSequenceClassification from abbassix +author: John Snow Labs +name: 2d_oomv1_800_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `2d_oomv1_800_pipeline` is an English model originally trained by abbassix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/2d_oomv1_800_pipeline_en_5.5.0_3.0_1727351434190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/2d_oomv1_800_pipeline_en_5.5.0_3.0_1727351434190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("2d_oomv1_800_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("2d_oomv1_800_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|2d_oomv1_800_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/abbassix/2d_oomv1_800 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-absa_bert_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-absa_bert_model_en.md new file mode 100644 index 00000000000000..b6adcf691319a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-absa_bert_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English absa_bert_model BertForSequenceClassification from Bareubara +author: John Snow Labs +name: absa_bert_model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `absa_bert_model` is an English model originally trained by Bareubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/absa_bert_model_en_5.5.0_3.0_1727341987684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/absa_bert_model_en_5.5.0_3.0_1727341987684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("absa_bert_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("absa_bert_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|absa_bert_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|669.3 MB| + +## References + +https://huggingface.co/Bareubara/absa-bert-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-absa_bert_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-absa_bert_model_pipeline_en.md new file mode 100644 index 00000000000000..edaaced77736d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-absa_bert_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English absa_bert_model_pipeline pipeline BertForSequenceClassification from Bareubara +author: John Snow Labs +name: absa_bert_model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `absa_bert_model_pipeline` is an English model originally trained by Bareubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/absa_bert_model_pipeline_en_5.5.0_3.0_1727342022951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/absa_bert_model_pipeline_en_5.5.0_3.0_1727342022951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("absa_bert_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("absa_bert_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|absa_bert_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|669.3 MB| + +## References + +https://huggingface.co/Bareubara/absa-bert-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-absabert_keluhanpln_v3_id.md b/docs/_posts/ahmedlone127/2024-09-26-absabert_keluhanpln_v3_id.md new file mode 100644 index 00000000000000..3d0b0e92560543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-absabert_keluhanpln_v3_id.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Indonesian absabert_keluhanpln_v3 BertForSequenceClassification from radityapranata +author: John Snow Labs +name: absabert_keluhanpln_v3 +date: 2024-09-26 +tags: [id, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: id +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `absabert_keluhanpln_v3` is an Indonesian model originally trained by radityapranata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/absabert_keluhanpln_v3_id_5.5.0_3.0_1727364190803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/absabert_keluhanpln_v3_id_5.5.0_3.0_1727364190803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("absabert_keluhanpln_v3","id") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("absabert_keluhanpln_v3", "id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
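In a pipeline like the one above, every annotator's input columns need to match a column produced by an earlier stage (or present in the source data); the Model Information table lists `[document, token]` as the expected inputs for this classifier. The following is a minimal plain-Python sketch of that wiring check. It does not use Spark NLP itself: the `Stage` tuple layout and the `find_unwired_inputs` helper are hypothetical names introduced only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical description of a stage: (stage name, input columns, output column).
Stage = Tuple[str, Sequence[str], str]


def find_unwired_inputs(
    stages: Sequence[Stage], source_columns: Sequence[str]
) -> Dict[str, List[str]]:
    """Return, per stage, the input columns that nothing upstream provides."""
    available = set(source_columns)
    missing: Dict[str, List[str]] = {}
    for name, inputs, output in stages:
        absent = [col for col in inputs if col not in available]
        if absent:
            missing[name] = absent
        # The stage's output becomes available to every later stage.
        available.add(output)
    return missing


# Column layout mirroring the example: text -> document -> token -> class.
stages = [
    ("documentAssembler", ["text"], "document"),
    ("tokenizer", ["document"], "token"),
    ("sequenceClassifier", ["document", "token"], "class"),
]
print(find_unwired_inputs(stages, ["text"]))  # {}
```

A misspelled column name (for example `documents` instead of `document`) would show up in the returned dictionary under the stage that requests it, which is usually easier to read than the error raised at `fit` time.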
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|absabert_keluhanpln_v3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|409.4 MB| + +## References + +https://huggingface.co/radityapranata/absabert-keluhanpln-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-accelerate_training_loop_sst2_en.md b/docs/_posts/ahmedlone127/2024-09-26-accelerate_training_loop_sst2_en.md new file mode 100644 index 00000000000000..32b4b7ba7cfd5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-accelerate_training_loop_sst2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English accelerate_training_loop_sst2 BertForSequenceClassification from Edward47 +author: John Snow Labs +name: accelerate_training_loop_sst2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `accelerate_training_loop_sst2` is an English model originally trained by Edward47. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/accelerate_training_loop_sst2_en_5.5.0_3.0_1727312740694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/accelerate_training_loop_sst2_en_5.5.0_3.0_1727312740694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("accelerate_training_loop_sst2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("accelerate_training_loop_sst2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
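The download archives linked from these model cards appear to share one naming pattern: model name, language code, Spark NLP version, Spark version and a numeric timestamp, joined by underscores. The sketch below splits such a file name into its parts. The pattern is inferred only from the URLs in these cards, not from any documented contract, and `ARCHIVE_PATTERN` and `parse_archive_name` are hypothetical names used for illustration.

```python
import re
from typing import Dict

# Assumed layout: <name>_<lang>_<Spark NLP version>_<Spark version>_<timestamp>.zip
ARCHIVE_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<nlp_version>\d+\.\d+\.\d+)"
    r"_(?P<spark_version>\d+\.\d+)_(?P<timestamp>\d+)\.zip$"
)


def parse_archive_name(filename: str) -> Dict[str, str]:
    """Split an archive file name into its assumed components."""
    match = ARCHIVE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized archive name: {filename}")
    return match.groupdict()


info = parse_archive_name(
    "accelerate_training_loop_sst2_en_5.5.0_3.0_1727312740694.zip"
)
print(info["name"], info["lang"], info["nlp_version"])
# accelerate_training_loop_sst2 en 5.5.0
```

The `name` and `lang` fields correspond to the two arguments passed to `pretrained(...)` in the examples, so a helper like this could be used to cross-check a card's code against its download link.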
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|accelerate_training_loop_sst2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Edward47/accelerate_training_loop_sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell_en.md b/docs/_posts/ahmedlone127/2024-09-26-aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell_en.md new file mode 100644 index 00000000000000..695f549aa6eb25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell` is an English model originally trained by ys7yoo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell_en_5.5.0_3.0_1727354354211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell_en_5.5.0_3.0_1727354354211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aes_bert_base_lr3e_05_wr1e_01_wd1e_02_ep5_bell| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/ys7yoo/aes_bert-base_lr3e-05_wr1e-01_wd1e-02_ep5_bell \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-afrisenti_yor_regression_en.md b/docs/_posts/ahmedlone127/2024-09-26-afrisenti_yor_regression_en.md new file mode 100644 index 00000000000000..1142487995165e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-afrisenti_yor_regression_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English afrisenti_yor_regression BertForSequenceClassification from HausaNLP +author: John Snow Labs +name: afrisenti_yor_regression +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `afrisenti_yor_regression` is an English model originally trained by HausaNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/afrisenti_yor_regression_en_5.5.0_3.0_1727351344342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/afrisenti_yor_regression_en_5.5.0_3.0_1727351344342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("afrisenti_yor_regression","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("afrisenti_yor_regression", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|afrisenti_yor_regression| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/HausaNLP/afrisenti-yor-regression \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-afrisenti_yor_regression_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-afrisenti_yor_regression_pipeline_en.md new file mode 100644 index 00000000000000..7fb21f70aa9066 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-afrisenti_yor_regression_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English afrisenti_yor_regression_pipeline pipeline BertForSequenceClassification from HausaNLP +author: John Snow Labs +name: afrisenti_yor_regression_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `afrisenti_yor_regression_pipeline` is an English model originally trained by HausaNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/afrisenti_yor_regression_pipeline_en_5.5.0_3.0_1727351378372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/afrisenti_yor_regression_pipeline_en_5.5.0_3.0_1727351378372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("afrisenti_yor_regression_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("afrisenti_yor_regression_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|afrisenti_yor_regression_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/HausaNLP/afrisenti-yor-regression + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_ctrip_zh.md b/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_ctrip_zh.md new file mode 100644 index 00000000000000..616242dece7a0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_ctrip_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese albert_base_finetuned_ctrip BertForSequenceClassification from WangA +author: John Snow Labs +name: albert_base_finetuned_ctrip +date: 2024-09-26 +tags: [zh, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_base_finetuned_ctrip` is a Chinese model originally trained by WangA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_base_finetuned_ctrip_zh_5.5.0_3.0_1727329026848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_base_finetuned_ctrip_zh_5.5.0_3.0_1727329026848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_finetuned_ctrip","zh") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_finetuned_ctrip", "zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_base_finetuned_ctrip| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|39.7 MB| + +## References + +https://huggingface.co/WangA/albert-base-finetuned-ctrip \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_jd_zh.md b/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_jd_zh.md new file mode 100644 index 00000000000000..fd524bae559a09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_jd_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese albert_base_finetuned_jd BertForSequenceClassification from WangA +author: John Snow Labs +name: albert_base_finetuned_jd +date: 2024-09-26 +tags: [zh, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_base_finetuned_jd` is a Chinese model originally trained by WangA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_base_finetuned_jd_zh_5.5.0_3.0_1727346144393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_base_finetuned_jd_zh_5.5.0_3.0_1727346144393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_finetuned_jd","zh") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_base_finetuned_jd", "zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_base_finetuned_jd| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|39.6 MB| + +## References + +https://huggingface.co/WangA/albert-base-finetuned-jd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_ocnli_chinese_pipeline_zh.md b/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_ocnli_chinese_pipeline_zh.md new file mode 100644 index 00000000000000..5e1c0e07b872e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_base_finetuned_ocnli_chinese_pipeline_zh.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Chinese albert_base_finetuned_ocnli_chinese_pipeline pipeline BertForSequenceClassification from WangA +author: John Snow Labs +name: albert_base_finetuned_ocnli_chinese_pipeline +date: 2024-09-26 +tags: [zh, open_source, pipeline, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_base_finetuned_ocnli_chinese_pipeline` is a Chinese model originally trained by WangA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_base_finetuned_ocnli_chinese_pipeline_zh_5.5.0_3.0_1727364512858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_base_finetuned_ocnli_chinese_pipeline_zh_5.5.0_3.0_1727364512858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("albert_base_finetuned_ocnli_chinese_pipeline", lang = "zh") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("albert_base_finetuned_ocnli_chinese_pipeline", lang = "zh") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_base_finetuned_ocnli_chinese_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|zh| +|Size:|39.7 MB| + +## References + +https://huggingface.co/WangA/albert-base-finetuned-ocnli-chinese + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_books_test_en.md b/docs/_posts/ahmedlone127/2024-09-26-albert_books_test_en.md new file mode 100644 index 00000000000000..ac5a534431aaff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_books_test_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_books_test BertForSequenceClassification from qingmou +author: John Snow Labs +name: albert_books_test +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_books_test` is an English model originally trained by qingmou. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_books_test_en_5.5.0_3.0_1727329109154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_books_test_en_5.5.0_3.0_1727329109154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_books_test","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_books_test", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_books_test| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|15.6 MB| + +## References + +https://huggingface.co/qingmou/albert-books-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_books_test_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-albert_books_test_pipeline_en.md new file mode 100644 index 00000000000000..0df2e1437a1796 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_books_test_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English albert_books_test_pipeline pipeline BertForSequenceClassification from qingmou +author: John Snow Labs +name: albert_books_test_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_books_test_pipeline` is an English model originally trained by qingmou. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_books_test_pipeline_en_5.5.0_3.0_1727329110283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_books_test_pipeline_en_5.5.0_3.0_1727329110283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("albert_books_test_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("albert_books_test_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_books_test_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|15.6 MB| + +## References + +https://huggingface.co/qingmou/albert-books-test + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_dzaveri_en.md b/docs/_posts/ahmedlone127/2024-09-26-albert_dzaveri_en.md new file mode 100644 index 00000000000000..483ff21a5212db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_dzaveri_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_dzaveri BertForSequenceClassification from dzaveri +author: John Snow Labs +name: albert_dzaveri +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_dzaveri` is an English model originally trained by dzaveri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_dzaveri_en_5.5.0_3.0_1727335218794.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_dzaveri_en_5.5.0_3.0_1727335218794.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_dzaveri","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_dzaveri", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_dzaveri| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|126.0 MB| + +## References + +https://huggingface.co/dzaveri/albert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_jiiyy_en.md b/docs/_posts/ahmedlone127/2024-09-26-albert_jiiyy_en.md new file mode 100644 index 00000000000000..05b38daa5f7da9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_jiiyy_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_jiiyy BertForSequenceClassification from jiiyy +author: John Snow Labs +name: albert_jiiyy +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_jiiyy` is an English model originally trained by jiiyy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_jiiyy_en_5.5.0_3.0_1727361296636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_jiiyy_en_5.5.0_3.0_1727361296636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_jiiyy","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_jiiyy", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_jiiyy| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|49.9 MB| + +## References + +https://huggingface.co/jiiyy/albert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_kor_base_finetuned_classfication_en.md b/docs/_posts/ahmedlone127/2024-09-26-albert_kor_base_finetuned_classfication_en.md new file mode 100644 index 00000000000000..c705645daba54d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_kor_base_finetuned_classfication_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_kor_base_finetuned_classfication BertForSequenceClassification from smjung +author: John Snow Labs +name: albert_kor_base_finetuned_classfication +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_kor_base_finetuned_classfication` is an English model originally trained by smjung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_kor_base_finetuned_classfication_en_5.5.0_3.0_1727366005912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_kor_base_finetuned_classfication_en_5.5.0_3.0_1727366005912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_kor_base_finetuned_classfication","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_kor_base_finetuned_classfication", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_kor_base_finetuned_classfication| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|49.9 MB| + +## References + +https://huggingface.co/smjung/albert-kor-base-finetuned-classfication \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albert_kor_base_finetuned_ynat_en.md b/docs/_posts/ahmedlone127/2024-09-26-albert_kor_base_finetuned_ynat_en.md new file mode 100644 index 00000000000000..6b44bdef755eb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albert_kor_base_finetuned_ynat_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albert_kor_base_finetuned_ynat BertForSequenceClassification from smjung +author: John Snow Labs +name: albert_kor_base_finetuned_ynat +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albert_kor_base_finetuned_ynat` is an English model originally trained by smjung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_kor_base_finetuned_ynat_en_5.5.0_3.0_1727366534251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_kor_base_finetuned_ynat_en_5.5.0_3.0_1727366534251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_kor_base_finetuned_ynat","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_kor_base_finetuned_ynat", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_kor_base_finetuned_ynat| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|49.9 MB| + +## References + +https://huggingface.co/smjung/albert-kor-base-finetuned-ynat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-albertobertnews_en.md b/docs/_posts/ahmedlone127/2024-09-26-albertobertnews_en.md new file mode 100644 index 00000000000000..8f43188cfcdc55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-albertobertnews_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English albertobertnews BertForSequenceClassification from GioReg +author: John Snow Labs +name: albertobertnews +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albertobertnews` is an English model originally trained by GioReg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albertobertnews_en_5.5.0_3.0_1727347615427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albertobertnews_en_5.5.0_3.0_1727347615427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("albertobertnews","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albertobertnews", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albertobertnews| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|691.9 MB| + +## References + +https://huggingface.co/GioReg/AlbertoBertnews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-amazon_cross_encoder_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-amazon_cross_encoder_classification_pipeline_en.md new file mode 100644 index 00000000000000..c44138f8dd3a0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-amazon_cross_encoder_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English amazon_cross_encoder_classification_pipeline pipeline BertForSequenceClassification from LiYuan +author: John Snow Labs +name: amazon_cross_encoder_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon_cross_encoder_classification_pipeline` is an English model originally trained by LiYuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazon_cross_encoder_classification_pipeline_en_5.5.0_3.0_1727338863960.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazon_cross_encoder_classification_pipeline_en_5.5.0_3.0_1727338863960.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("amazon_cross_encoder_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("amazon_cross_encoder_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazon_cross_encoder_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.2 MB| + +## References + +https://huggingface.co/LiYuan/Amazon-Cross-Encoder-Classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-amazon_rating_review_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-amazon_rating_review_model_en.md new file mode 100644 index 00000000000000..e540ed3b015706 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-amazon_rating_review_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English amazon_rating_review_model BertForSequenceClassification from MahmoudMohamed +author: John Snow Labs +name: amazon_rating_review_model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon_rating_review_model` is an English model originally trained by MahmoudMohamed. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazon_rating_review_model_en_5.5.0_3.0_1727339421274.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazon_rating_review_model_en_5.5.0_3.0_1727339421274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("amazon_rating_review_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("amazon_rating_review_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazon_rating_review_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/MahmoudMohamed/Amazon_rating_review_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-amazon_reviews_finetuning_bert_base_sentiment_en.md b/docs/_posts/ahmedlone127/2024-09-26-amazon_reviews_finetuning_bert_base_sentiment_en.md new file mode 100644 index 00000000000000..31db449fe03070 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-amazon_reviews_finetuning_bert_base_sentiment_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English amazon_reviews_finetuning_bert_base_sentiment BertForSequenceClassification from santiviquez +author: John Snow Labs +name: amazon_reviews_finetuning_bert_base_sentiment +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon_reviews_finetuning_bert_base_sentiment` is an English model originally trained by santiviquez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazon_reviews_finetuning_bert_base_sentiment_en_5.5.0_3.0_1727342256301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazon_reviews_finetuning_bert_base_sentiment_en_5.5.0_3.0_1727342256301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("amazon_reviews_finetuning_bert_base_sentiment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("amazon_reviews_finetuning_bert_base_sentiment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazon_reviews_finetuning_bert_base_sentiment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.8 MB| + +## References + +https://huggingface.co/santiviquez/amazon-reviews-finetuning-bert-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-amazon_reviews_finetuning_bert_base_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-amazon_reviews_finetuning_bert_base_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..46b7a1b2055d39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-amazon_reviews_finetuning_bert_base_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English amazon_reviews_finetuning_bert_base_sentiment_pipeline pipeline BertForSequenceClassification from santiviquez +author: John Snow Labs +name: amazon_reviews_finetuning_bert_base_sentiment_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon_reviews_finetuning_bert_base_sentiment_pipeline` is an English model originally trained by santiviquez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazon_reviews_finetuning_bert_base_sentiment_pipeline_en_5.5.0_3.0_1727342290054.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazon_reviews_finetuning_bert_base_sentiment_pipeline_en_5.5.0_3.0_1727342290054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("amazon_reviews_finetuning_bert_base_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("amazon_reviews_finetuning_bert_base_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazon_reviews_finetuning_bert_base_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|627.8 MB| + +## References + +https://huggingface.co/santiviquez/amazon-reviews-finetuning-bert-base-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-android_ios_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-android_ios_classification_en.md new file mode 100644 index 00000000000000..308e4772601b4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-android_ios_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English android_ios_classification BertForSequenceClassification from EasthShin +author: John Snow Labs +name: android_ios_classification +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `android_ios_classification` is an English model originally trained by EasthShin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/android_ios_classification_en_5.5.0_3.0_1727335627806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/android_ios_classification_en_5.5.0_3.0_1727335627806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("android_ios_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("android_ios_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|android_ios_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/EasthShin/Android_Ios_Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-android_ios_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-android_ios_classification_pipeline_en.md new file mode 100644 index 00000000000000..9c7fc690852992 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-android_ios_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English android_ios_classification_pipeline pipeline BertForSequenceClassification from EasthShin +author: John Snow Labs +name: android_ios_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `android_ios_classification_pipeline` is an English model originally trained by EasthShin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/android_ios_classification_pipeline_en_5.5.0_3.0_1727335648636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/android_ios_classification_pipeline_en_5.5.0_3.0_1727335648636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("android_ios_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("android_ios_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|android_ios_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/EasthShin/Android_Ios_Classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-arabert_restaurant_sentiment_ar.md b/docs/_posts/ahmedlone127/2024-09-26-arabert_restaurant_sentiment_ar.md new file mode 100644 index 00000000000000..6420a66638e25d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-arabert_restaurant_sentiment_ar.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Arabic arabert_restaurant_sentiment BertForSequenceClassification from moazx +author: John Snow Labs +name: arabert_restaurant_sentiment +date: 2024-09-26 +tags: [ar, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_restaurant_sentiment` is an Arabic model originally trained by moazx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_restaurant_sentiment_ar_5.5.0_3.0_1727355508253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_restaurant_sentiment_ar_5.5.0_3.0_1727355508253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("arabert_restaurant_sentiment","ar") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("arabert_restaurant_sentiment", "ar") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_restaurant_sentiment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|507.3 MB| + +## References + +https://huggingface.co/moazx/AraBERT-Restaurant-Sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-arabert_restaurant_sentiment_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-09-26-arabert_restaurant_sentiment_pipeline_ar.md new file mode 100644 index 00000000000000..68934a1e81b54b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-arabert_restaurant_sentiment_pipeline_ar.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Arabic arabert_restaurant_sentiment_pipeline pipeline BertForSequenceClassification from moazx +author: John Snow Labs +name: arabert_restaurant_sentiment_pipeline +date: 2024-09-26 +tags: [ar, open_source, pipeline, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_restaurant_sentiment_pipeline` is an Arabic model originally trained by moazx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_restaurant_sentiment_pipeline_ar_5.5.0_3.0_1727355534564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_restaurant_sentiment_pipeline_ar_5.5.0_3.0_1727355534564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("arabert_restaurant_sentiment_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("arabert_restaurant_sentiment_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_restaurant_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|507.3 MB| + +## References + +https://huggingface.co/moazx/AraBERT-Restaurant-Sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-arabertv2_fully_supervised_arabic_propaganda_en.md b/docs/_posts/ahmedlone127/2024-09-26-arabertv2_fully_supervised_arabic_propaganda_en.md new file mode 100644 index 00000000000000..048bcc4e2a4659 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-arabertv2_fully_supervised_arabic_propaganda_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English arabertv2_fully_supervised_arabic_propaganda BertForSequenceClassification from Bmalmotairy +author: John Snow Labs +name: arabertv2_fully_supervised_arabic_propaganda +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabertv2_fully_supervised_arabic_propaganda` is an English model originally trained by Bmalmotairy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabertv2_fully_supervised_arabic_propaganda_en_5.5.0_3.0_1727316807057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabertv2_fully_supervised_arabic_propaganda_en_5.5.0_3.0_1727316807057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("arabertv2_fully_supervised_arabic_propaganda","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("arabertv2_fully_supervised_arabic_propaganda", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabertv2_fully_supervised_arabic_propaganda| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.1 MB| + +## References + +https://huggingface.co/Bmalmotairy/arabertv2-fully-supervised-arabic-propaganda \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-arabglossbert_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-09-26-arabglossbert_pipeline_ar.md new file mode 100644 index 00000000000000..c539460f98ed50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-arabglossbert_pipeline_ar.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Arabic arabglossbert_pipeline pipeline BertForSequenceClassification from SinaLab +author: John Snow Labs +name: arabglossbert_pipeline +date: 2024-09-26 +tags: [ar, open_source, pipeline, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabglossbert_pipeline` is an Arabic model originally trained by SinaLab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabglossbert_pipeline_ar_5.5.0_3.0_1727321747819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabglossbert_pipeline_ar_5.5.0_3.0_1727321747819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("arabglossbert_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("arabglossbert_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabglossbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|507.3 MB| + +## References + +https://huggingface.co/SinaLab/ArabGlossBERT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-arabicsent_chamabert_ar.md b/docs/_posts/ahmedlone127/2024-09-26-arabicsent_chamabert_ar.md new file mode 100644 index 00000000000000..610e8b0746f089 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-arabicsent_chamabert_ar.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Arabic arabicsent_chamabert BertForSequenceClassification from ChaimaaBouafoud +author: John Snow Labs +name: arabicsent_chamabert +date: 2024-09-26 +tags: [ar, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabicsent_chamabert` is an Arabic model originally trained by ChaimaaBouafoud. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabicsent_chamabert_ar_5.5.0_3.0_1727357287003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabicsent_chamabert_ar_5.5.0_3.0_1727357287003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("arabicsent_chamabert","ar") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("arabicsent_chamabert", "ar") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabicsent_chamabert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|507.3 MB| + +## References + +https://huggingface.co/ChaimaaBouafoud/arabicSent-ChamaBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-arabicsent_chamabert_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-09-26-arabicsent_chamabert_pipeline_ar.md new file mode 100644 index 00000000000000..98e98190c0f019 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-arabicsent_chamabert_pipeline_ar.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Arabic arabicsent_chamabert_pipeline pipeline BertForSequenceClassification from ChaimaaBouafoud +author: John Snow Labs +name: arabicsent_chamabert_pipeline +date: 2024-09-26 +tags: [ar, open_source, pipeline, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabicsent_chamabert_pipeline` is an Arabic model originally trained by ChaimaaBouafoud. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabicsent_chamabert_pipeline_ar_5.5.0_3.0_1727357313430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabicsent_chamabert_pipeline_ar_5.5.0_3.0_1727357313430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("arabicsent_chamabert_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("arabicsent_chamabert_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabicsent_chamabert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|507.3 MB| + +## References + +https://huggingface.co/ChaimaaBouafoud/arabicSent-ChamaBert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-autotrain_bertbase_imdb_1275748790_en.md b/docs/_posts/ahmedlone127/2024-09-26-autotrain_bertbase_imdb_1275748790_en.md new file mode 100644 index 00000000000000..0f9816a47dd675 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-autotrain_bertbase_imdb_1275748790_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autotrain_bertbase_imdb_1275748790 BertForSequenceClassification from sasha +author: John Snow Labs +name: autotrain_bertbase_imdb_1275748790 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_bertbase_imdb_1275748790` is an English model originally trained by sasha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_bertbase_imdb_1275748790_en_5.5.0_3.0_1727343243596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_bertbase_imdb_1275748790_en_5.5.0_3.0_1727343243596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_bertbase_imdb_1275748790","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_bertbase_imdb_1275748790", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_bertbase_imdb_1275748790| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sasha/autotrain-BERTBase-imdb-1275748790 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-autotrain_customers_email_sentiment_3449294006_en.md b/docs/_posts/ahmedlone127/2024-09-26-autotrain_customers_email_sentiment_3449294006_en.md new file mode 100644 index 00000000000000..27d6217c5e98af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-autotrain_customers_email_sentiment_3449294006_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autotrain_customers_email_sentiment_3449294006 BertForSequenceClassification from zabiullah +author: John Snow Labs +name: autotrain_customers_email_sentiment_3449294006 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_customers_email_sentiment_3449294006` is an English model originally trained by zabiullah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_customers_email_sentiment_3449294006_en_5.5.0_3.0_1727356761628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_customers_email_sentiment_3449294006_en_5.5.0_3.0_1727356761628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_customers_email_sentiment_3449294006","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_customers_email_sentiment_3449294006", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_customers_email_sentiment_3449294006| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/zabiullah/autotrain-customers_email_sentiment-3449294006 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-autotrain_customers_email_sentiment_3449294006_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-autotrain_customers_email_sentiment_3449294006_pipeline_en.md new file mode 100644 index 00000000000000..61bb7620a5adbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-autotrain_customers_email_sentiment_3449294006_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English autotrain_customers_email_sentiment_3449294006_pipeline pipeline BertForSequenceClassification from zabiullah +author: John Snow Labs +name: autotrain_customers_email_sentiment_3449294006_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_customers_email_sentiment_3449294006_pipeline` is an English model originally trained by zabiullah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_customers_email_sentiment_3449294006_pipeline_en_5.5.0_3.0_1727356824234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_customers_email_sentiment_3449294006_pipeline_en_5.5.0_3.0_1727356824234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("autotrain_customers_email_sentiment_3449294006_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("autotrain_customers_email_sentiment_3449294006_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_customers_email_sentiment_3449294006_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/zabiullah/autotrain-customers_email_sentiment-3449294006 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-banking77_text_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-banking77_text_classification_pipeline_en.md new file mode 100644 index 00000000000000..959e6d6d47c689 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-banking77_text_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English banking77_text_classification_pipeline pipeline BertForSequenceClassification from ramnathv +author: John Snow Labs +name: banking77_text_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `banking77_text_classification_pipeline` is an English model originally trained by ramnathv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/banking77_text_classification_pipeline_en_5.5.0_3.0_1727339734496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/banking77_text_classification_pipeline_en_5.5.0_3.0_1727339734496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("banking77_text_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("banking77_text_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|banking77_text_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/ramnathv/banking77-text-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-base_bert_fine_tuned_rte_en.md b/docs/_posts/ahmedlone127/2024-09-26-base_bert_fine_tuned_rte_en.md new file mode 100644 index 00000000000000..5c18eb2bb9c206 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-base_bert_fine_tuned_rte_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English base_bert_fine_tuned_rte BertForSequenceClassification from rycecorn +author: John Snow Labs +name: base_bert_fine_tuned_rte +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_bert_fine_tuned_rte` is an English model originally trained by rycecorn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_bert_fine_tuned_rte_en_5.5.0_3.0_1727312115801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_bert_fine_tuned_rte_en_5.5.0_3.0_1727312115801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("base_bert_fine_tuned_rte","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("base_bert_fine_tuned_rte", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_bert_fine_tuned_rte| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rycecorn/base-bert-fine-tuned-RTE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-base_bert_fine_tuned_rte_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-base_bert_fine_tuned_rte_pipeline_en.md new file mode 100644 index 00000000000000..d70ad20b06fef4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-base_bert_fine_tuned_rte_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English base_bert_fine_tuned_rte_pipeline pipeline BertForSequenceClassification from rycecorn +author: John Snow Labs +name: base_bert_fine_tuned_rte_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_bert_fine_tuned_rte_pipeline` is an English model originally trained by rycecorn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_bert_fine_tuned_rte_pipeline_en_5.5.0_3.0_1727312137197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_bert_fine_tuned_rte_pipeline_en_5.5.0_3.0_1727312137197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("base_bert_fine_tuned_rte_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("base_bert_fine_tuned_rte_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_bert_fine_tuned_rte_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rycecorn/base-bert-fine-tuned-RTE + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_202k_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_202k_en.md new file mode 100644 index 00000000000000..c8f93d047fe466 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_202k_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_202k BertForSequenceClassification from 202k +author: John Snow Labs +name: bert_202k +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_202k` is an English model originally trained by 202k. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_202k_en_5.5.0_3.0_1727326457560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_202k_en_5.5.0_3.0_1727326457560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_202k","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_202k", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_202k| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/202k/bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_43_multilabel_emotion_detection_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_43_multilabel_emotion_detection_en.md new file mode 100644 index 00000000000000..a25d03dec50520 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_43_multilabel_emotion_detection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_43_multilabel_emotion_detection BertForSequenceClassification from borisn70 +author: John Snow Labs +name: bert_43_multilabel_emotion_detection +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_43_multilabel_emotion_detection` is an English model originally trained by borisn70. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_43_multilabel_emotion_detection_en_5.5.0_3.0_1727370925464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_43_multilabel_emotion_detection_en_5.5.0_3.0_1727370925464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_43_multilabel_emotion_detection","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_43_multilabel_emotion_detection", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_43_multilabel_emotion_detection|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.5 MB|
+
+## References
+
+https://huggingface.co/borisn70/bert-43-multilabel-emotion-detection
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_aigc_classification_english_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_aigc_classification_english_en.md
new file mode 100644
index 00000000000000..50a7c79c0ae9bb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_aigc_classification_english_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_aigc_classification_english BertForSequenceClassification from youssefkhalil320
+author: John Snow Labs
+name: bert_aigc_classification_english
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_aigc_classification_english` is an English model originally trained by youssefkhalil320.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_aigc_classification_english_en_5.5.0_3.0_1727340080914.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_aigc_classification_english_en_5.5.0_3.0_1727340080914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_aigc_classification_english","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_aigc_classification_english", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_aigc_classification_english|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.5 MB|
+
+## References
+
+https://huggingface.co/youssefkhalil320/bert-AIGC-classification-en
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_aigc_classification_english_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_aigc_classification_english_pipeline_en.md
new file mode 100644
index 00000000000000..796c917b80017a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_aigc_classification_english_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_aigc_classification_english_pipeline pipeline BertForSequenceClassification from youssefkhalil320
+author: John Snow Labs
+name: bert_aigc_classification_english_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_aigc_classification_english_pipeline` is an English model originally trained by youssefkhalil320.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_aigc_classification_english_pipeline_en_5.5.0_3.0_1727340102646.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_aigc_classification_english_pipeline_en_5.5.0_3.0_1727340102646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_aigc_classification_english_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_aigc_classification_english_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_aigc_classification_english_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.5 MB|
+
+## References
+
+https://huggingface.co/youssefkhalil320/bert-AIGC-classification-en
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_arc_code_personality_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_arc_code_personality_en.md
new file mode 100644
index 00000000000000..743c1089dc49ed
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_arc_code_personality_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_arc_code_personality BertForSequenceClassification from AdithyaSK
+author: John Snow Labs
+name: bert_arc_code_personality
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_arc_code_personality` is an English model originally trained by AdithyaSK.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_arc_code_personality_en_5.5.0_3.0_1727346724159.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_arc_code_personality_en_5.5.0_3.0_1727346724159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_arc_code_personality","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_arc_code_personality", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_arc_code_personality|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/AdithyaSK/Bert_arc_code_personality
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_azahead_v1_0_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_azahead_v1_0_en.md
new file mode 100644
index 00000000000000..90ecb0e7e899cb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_azahead_v1_0_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_azahead_v1_0 BertForSequenceClassification from zwellington
+author: John Snow Labs
+name: bert_azahead_v1_0
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_azahead_v1_0` is an English model originally trained by zwellington.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_azahead_v1_0_en_5.5.0_3.0_1727351508182.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_azahead_v1_0_en_5.5.0_3.0_1727351508182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_azahead_v1_0","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_azahead_v1_0", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_azahead_v1_0|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/zwellington/bert-azahead-v1.0
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_azahead_v1_0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_azahead_v1_0_pipeline_en.md
new file mode 100644
index 00000000000000..1ac298f33241ce
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_azahead_v1_0_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_azahead_v1_0_pipeline pipeline BertForSequenceClassification from zwellington
+author: John Snow Labs
+name: bert_azahead_v1_0_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_azahead_v1_0_pipeline` is an English model originally trained by zwellington.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_azahead_v1_0_pipeline_en_5.5.0_3.0_1727351529937.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_azahead_v1_0_pipeline_en_5.5.0_3.0_1727351529937.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_azahead_v1_0_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_azahead_v1_0_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_azahead_v1_0_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/zwellington/bert-azahead-v1.0
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabert_finetuned_s2d_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabert_finetuned_s2d_pipeline_en.md
new file mode 100644
index 00000000000000..8163888ab896fd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabert_finetuned_s2d_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_arabert_finetuned_s2d_pipeline pipeline BertForSequenceClassification from IsmailRabii
+author: John Snow Labs
+name: bert_base_arabert_finetuned_s2d_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabert_finetuned_s2d_pipeline` is an English model originally trained by IsmailRabii.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabert_finetuned_s2d_pipeline_en_5.5.0_3.0_1727355865186.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabert_finetuned_s2d_pipeline_en_5.5.0_3.0_1727355865186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_arabert_finetuned_s2d_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_arabert_finetuned_s2d_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_arabert_finetuned_s2d_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|506.9 MB|
+
+## References
+
+https://huggingface.co/IsmailRabii/bert-base-arabert-finetuned-S2D
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabertv2_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabertv2_1_pipeline_en.md
new file mode 100644
index 00000000000000..0ba0face8047f7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabertv2_1_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_arabertv2_1_pipeline pipeline BertForSequenceClassification from elsayedissa
+author: John Snow Labs
+name: bert_base_arabertv2_1_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabertv2_1_pipeline` is an English model originally trained by elsayedissa.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabertv2_1_pipeline_en_5.5.0_3.0_1727346294106.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabertv2_1_pipeline_en_5.5.0_3.0_1727346294106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_arabertv2_1_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_arabertv2_1_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_arabertv2_1_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|507.1 MB|
+
+## References
+
+https://huggingface.co/elsayedissa/bert-base-arabertv2_1
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_camelbert_catalan_tydi_pairs_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_camelbert_catalan_tydi_pairs_en.md
new file mode 100644
index 00000000000000..6216dd4cc08f7d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_camelbert_catalan_tydi_pairs_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_arabic_camelbert_catalan_tydi_pairs BertForSequenceClassification from MatMulMan
+author: John Snow Labs
+name: bert_base_arabic_camelbert_catalan_tydi_pairs
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabic_camelbert_catalan_tydi_pairs` is an English model originally trained by MatMulMan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabic_camelbert_catalan_tydi_pairs_en_5.5.0_3.0_1727315783417.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabic_camelbert_catalan_tydi_pairs_en_5.5.0_3.0_1727315783417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabic_camelbert_catalan_tydi_pairs","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabic_camelbert_catalan_tydi_pairs", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_arabic_camelbert_catalan_tydi_pairs|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|408.8 MB|
+
+## References
+
+https://huggingface.co/MatMulMan/bert-base-arabic-camelbert-ca-tydi-pairs
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_electra_xnli_finetuned_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_electra_xnli_finetuned_en.md
new file mode 100644
index 00000000000000..bc3ed393ca12b4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_electra_xnli_finetuned_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_arabic_electra_xnli_finetuned BertForSequenceClassification from SarahAdnan
+author: John Snow Labs
+name: bert_base_arabic_electra_xnli_finetuned
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabic_electra_xnli_finetuned` is an English model originally trained by SarahAdnan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabic_electra_xnli_finetuned_en_5.5.0_3.0_1727347710744.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabic_electra_xnli_finetuned_en_5.5.0_3.0_1727347710744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabic_electra_xnli_finetuned","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabic_electra_xnli_finetuned", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_arabic_electra_xnli_finetuned|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|506.5 MB|
+
+## References
+
+https://huggingface.co/SarahAdnan/bert-base-arabic-electra-xnli-finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_electra_xnli_finetuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_electra_xnli_finetuned_pipeline_en.md
new file mode 100644
index 00000000000000..49a2e3e2ac5d03
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabic_electra_xnli_finetuned_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_arabic_electra_xnli_finetuned_pipeline pipeline BertForSequenceClassification from SarahAdnan
+author: John Snow Labs
+name: bert_base_arabic_electra_xnli_finetuned_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabic_electra_xnli_finetuned_pipeline` is an English model originally trained by SarahAdnan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabic_electra_xnli_finetuned_pipeline_en_5.5.0_3.0_1727347737494.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabic_electra_xnli_finetuned_pipeline_en_5.5.0_3.0_1727347737494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_arabic_electra_xnli_finetuned_pipeline", lang = "en")
+annotations = pipeline.transform(df)  # df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_arabic_electra_xnli_finetuned_pipeline", lang = "en")
+val annotations = pipeline.transform(df) // df: your input Spark DataFrame (a "text" column is assumed)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_arabic_electra_xnli_finetuned_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|506.5 MB|
+
+## References
+
+https://huggingface.co/SarahAdnan/bert-base-arabic-electra-xnli-finetuned
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabicbert_xnli_finetuned_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabicbert_xnli_finetuned_en.md
new file mode 100644
index 00000000000000..458d6b362e6f70
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_arabicbert_xnli_finetuned_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_arabicbert_xnli_finetuned BertForSequenceClassification from SarahAdnan
+author: John Snow Labs
+name: bert_base_arabicbert_xnli_finetuned
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabicbert_xnli_finetuned` is an English model originally trained by SarahAdnan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabicbert_xnli_finetuned_en_5.5.0_3.0_1727316538367.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabicbert_xnli_finetuned_en_5.5.0_3.0_1727316538367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabicbert_xnli_finetuned","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_arabicbert_xnli_finetuned", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arabicbert_xnli_finetuned| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.2 MB| + +## References + +https://huggingface.co/SarahAdnan/bert-base-arabicBERT-xnli-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_ahmadalsharef994_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_ahmadalsharef994_en.md new file mode 100644 index 00000000000000..ff21b750d0eb94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_ahmadalsharef994_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_pt2_ahmadalsharef994 BertForSequenceClassification from ahmadalsharef994 +author: John Snow Labs +name: bert_base_banking77_pt2_ahmadalsharef994 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_ahmadalsharef994` is an English model originally trained by ahmadalsharef994.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_ahmadalsharef994_en_5.5.0_3.0_1727351168853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_ahmadalsharef994_en_5.5.0_3.0_1727351168853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_ahmadalsharef994","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_ahmadalsharef994", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_ahmadalsharef994| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/ahmadalsharef994/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_liu_xiang_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_liu_xiang_en.md new file mode 100644 index 00000000000000..6df4be847e5c00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_liu_xiang_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_pt2_liu_xiang BertForSequenceClassification from Liu-Xiang +author: John Snow Labs +name: bert_base_banking77_pt2_liu_xiang +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_liu_xiang` is an English model originally trained by Liu-Xiang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_liu_xiang_en_5.5.0_3.0_1727342134440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_liu_xiang_en_5.5.0_3.0_1727342134440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_liu_xiang","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_liu_xiang", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_liu_xiang| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Liu-Xiang/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_liu_xiang_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_liu_xiang_pipeline_en.md new file mode 100644 index 00000000000000..7e4fde083304d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_liu_xiang_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_liu_xiang_pipeline pipeline BertForSequenceClassification from Liu-Xiang +author: John Snow Labs +name: bert_base_banking77_pt2_liu_xiang_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_liu_xiang_pipeline` is an English model originally trained by Liu-Xiang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_liu_xiang_pipeline_en_5.5.0_3.0_1727342160802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_liu_xiang_pipeline_en_5.5.0_3.0_1727342160802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_liu_xiang_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_liu_xiang_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_liu_xiang_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Liu-Xiang/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline_en.md new file mode 100644 index 00000000000000..11537dfad33ab0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline pipeline BertForSequenceClassification from psj0919 +author: John Snow Labs +name: bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline` is an English model originally trained by psj0919.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline_en_5.5.0_3.0_1727338320710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline_en_5.5.0_3.0_1727338320710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_model_bertforsequenceclassification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/psj0919/bert-base-banking77-pt2_model_BertForSequenceClassification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_nakker_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_nakker_pipeline_en.md new file mode 100644 index 00000000000000..3ee29e60d8b943 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_nakker_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_nakker_pipeline pipeline BertForSequenceClassification from nakker +author: John Snow Labs +name: bert_base_banking77_pt2_nakker_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_nakker_pipeline` is an English model originally trained by nakker.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_nakker_pipeline_en_5.5.0_3.0_1727348109172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_nakker_pipeline_en_5.5.0_3.0_1727348109172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_nakker_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_nakker_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_nakker_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/nakker/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_nullzero_live_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_nullzero_live_pipeline_en.md new file mode 100644 index 00000000000000..2887cc5c05d640 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_nullzero_live_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_nullzero_live_pipeline pipeline BertForSequenceClassification from nullzero-live +author: John Snow Labs +name: bert_base_banking77_pt2_nullzero_live_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_nullzero_live_pipeline` is an English model originally trained by nullzero-live.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_nullzero_live_pipeline_en_5.5.0_3.0_1727350749359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_nullzero_live_pipeline_en_5.5.0_3.0_1727350749359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_nullzero_live_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_nullzero_live_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_nullzero_live_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/nullzero-live/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_sajjadamjad_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_sajjadamjad_pipeline_en.md new file mode 100644 index 00000000000000..e65c0ead439756 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_pt2_sajjadamjad_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_pt2_sajjadamjad_pipeline pipeline BertForSequenceClassification from sajjadamjad +author: John Snow Labs +name: bert_base_banking77_pt2_sajjadamjad_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_sajjadamjad_pipeline` is an English model originally trained by sajjadamjad.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_sajjadamjad_pipeline_en_5.5.0_3.0_1727339690975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_sajjadamjad_pipeline_en_5.5.0_3.0_1727339690975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_pt2_sajjadamjad_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_pt2_sajjadamjad_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_sajjadamjad_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/sajjadamjad/bert-base-banking77-pt2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_t2_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_t2_en.md new file mode 100644 index 00000000000000..2a0d384117c188 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_t2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_banking77_t2 BertForSequenceClassification from YUCHUL +author: John Snow Labs +name: bert_base_banking77_t2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_t2` is an English model originally trained by YUCHUL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_t2_en_5.5.0_3.0_1727310643489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_t2_en_5.5.0_3.0_1727310643489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_t2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_t2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_t2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/YUCHUL/bert-base-banking77-t2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_t2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_t2_pipeline_en.md new file mode 100644 index 00000000000000..3b535e3e69c92c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_banking77_t2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_banking77_t2_pipeline pipeline BertForSequenceClassification from YUCHUL +author: John Snow Labs +name: bert_base_banking77_t2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_t2_pipeline` is an English model originally trained by YUCHUL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_t2_pipeline_en_5.5.0_3.0_1727310665675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_t2_pipeline_en_5.5.0_3.0_1727310665675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_banking77_t2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_banking77_t2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_t2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/YUCHUL/bert-base-banking77-t2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_5k_vul_hyp_exp_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_5k_vul_hyp_exp_pipeline_en.md new file mode 100644 index 00000000000000..92de1f48ad02b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_5k_vul_hyp_exp_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_5k_vul_hyp_exp_pipeline pipeline BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_cased_5k_vul_hyp_exp_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_5k_vul_hyp_exp_pipeline` is an English model originally trained by AbhishekkV19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_5k_vul_hyp_exp_pipeline_en_5.5.0_3.0_1727352618828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_5k_vul_hyp_exp_pipeline_en_5.5.0_3.0_1727352618828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_5k_vul_hyp_exp_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_5k_vul_hyp_exp_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_5k_vul_hyp_exp_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-cased-5k-vul-hyp-exp + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1_en.md new file mode 100644 index 00000000000000..bedcb0a98aa267 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1 BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1` is an English model originally trained by AbhishekkV19.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1_en_5.5.0_3.0_1727351611055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1_en_5.5.0_3.0_1727351611055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_5kvul_10aug_3nsfw_10w_exp_10ep_s42_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-cased-5kvul-10aug-3nsfw-10w-exp-10ep-s42-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_embed_mixup_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_embed_mixup_en.md new file mode 100644 index 00000000000000..9027823b2d57d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_embed_mixup_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_embed_mixup BertForSequenceClassification from pa-shk +author: John Snow Labs +name: bert_base_cased_embed_mixup +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_embed_mixup` is an English model originally trained by pa-shk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_embed_mixup_en_5.5.0_3.0_1727344066300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_embed_mixup_en_5.5.0_3.0_1727344066300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_embed_mixup","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_embed_mixup", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_embed_mixup| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/pa-shk/bert-base-cased-embed-mixup \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_profane_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_profane_en.md new file mode 100644 index 00000000000000..703a0161444f1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_profane_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_english_sentweet_profane BertForSequenceClassification from jayanta +author: John Snow Labs +name: bert_base_cased_english_sentweet_profane +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_english_sentweet_profane` is an English model originally trained by jayanta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_profane_en_5.5.0_3.0_1727350245669.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_profane_en_5.5.0_3.0_1727350245669.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_english_sentweet_profane","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_english_sentweet_profane", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_english_sentweet_profane| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/jayanta/bert-base-cased-english-sentweet-Profane \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..da189c5fc0e25d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_english_sentweet_sentiment_pipeline pipeline BertForSequenceClassification from jayanta +author: John Snow Labs +name: bert_base_cased_english_sentweet_sentiment_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_english_sentweet_sentiment_pipeline` is an English model originally trained by jayanta.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_sentiment_pipeline_en_5.5.0_3.0_1727315205096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_sentiment_pipeline_en_5.5.0_3.0_1727315205096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_english_sentweet_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_english_sentweet_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_english_sentweet_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jayanta/bert-base-cased-english-sentweet-Sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_targeted_insult_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_targeted_insult_en.md new file mode 100644 index 00000000000000..51854d64f3b1fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_english_sentweet_targeted_insult_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_english_sentweet_targeted_insult BertForSequenceClassification from jayanta +author: John Snow Labs +name: bert_base_cased_english_sentweet_targeted_insult +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_english_sentweet_targeted_insult` is an English model originally trained by jayanta.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_targeted_insult_en_5.5.0_3.0_1727315543106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_english_sentweet_targeted_insult_en_5.5.0_3.0_1727315543106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_english_sentweet_targeted_insult","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_english_sentweet_targeted_insult", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_english_sentweet_targeted_insult| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/jayanta/bert-base-cased-english-sentweet-Targeted-Insult \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_1_en.md new file mode 100644 index 00000000000000..4a0d0f111430e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_finetuned_1 BertForSequenceClassification from sara-nabhani +author: John Snow Labs +name: bert_base_cased_finetuned_1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_1` is an English model originally trained by sara-nabhani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_1_en_5.5.0_3.0_1727319832621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_1_en_5.5.0_3.0_1727319832621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/sara-nabhani/bert-base-cased-finetuned-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_filtered_0609_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_filtered_0609_en.md new file mode 100644 index 00000000000000..e4015a394693d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_filtered_0609_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_finetuned_filtered_0609 BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: bert_base_cased_finetuned_filtered_0609 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_filtered_0609` is an English model originally trained by YeRyeongLee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_filtered_0609_en_5.5.0_3.0_1727321870060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_filtered_0609_en_5.5.0_3.0_1727321870060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_filtered_0609","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_filtered_0609", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_filtered_0609| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/YeRyeongLee/bert-base-cased-finetuned-filtered-0609 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_pan24_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_pan24_en.md new file mode 100644 index 00000000000000..103d9d28eb0278 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_pan24_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_finetuned_pan24 BertForSequenceClassification from andricValdez +author: John Snow Labs +name: bert_base_cased_finetuned_pan24 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_pan24` is an English model originally trained by andricValdez. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_pan24_en_5.5.0_3.0_1727347198201.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_pan24_en_5.5.0_3.0_1727347198201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_pan24","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_pan24", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_pan24| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/andricValdez/bert-base-cased-finetuned-pan24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_qqp_zyl1024_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_qqp_zyl1024_en.md new file mode 100644 index 00000000000000..534bfd47aca6ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_qqp_zyl1024_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_finetuned_qqp_zyl1024 BertForSequenceClassification from zyl1024 +author: John Snow Labs +name: bert_base_cased_finetuned_qqp_zyl1024 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_qqp_zyl1024` is an English model originally trained by zyl1024. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_qqp_zyl1024_en_5.5.0_3.0_1727323515599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_qqp_zyl1024_en_5.5.0_3.0_1727323515599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_qqp_zyl1024","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_qqp_zyl1024", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_qqp_zyl1024| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/zyl1024/bert-base-cased-finetuned-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline_en.md new file mode 100644 index 00000000000000..18e2c693784257 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline pipeline BertForSequenceClassification from zebans +author: John Snow Labs +name: bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline` is an English model originally trained by zebans.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline_en_5.5.0_3.0_1727343896901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline_en_5.5.0_3.0_1727343896901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_rotten_tomatoes_epochs_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/zebans/bert-base-cased-finetuned-rotten-tomatoes-epochs-2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_teachermomentsconfusion_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_teachermomentsconfusion_pipeline_en.md new file mode 100644 index 00000000000000..227308ef3fd7cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_finetuned_teachermomentsconfusion_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_finetuned_teachermomentsconfusion_pipeline pipeline BertForSequenceClassification from Ruborobot +author: John Snow Labs +name: bert_base_cased_finetuned_teachermomentsconfusion_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_teachermomentsconfusion_pipeline` is an English model originally trained by Ruborobot.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_teachermomentsconfusion_pipeline_en_5.5.0_3.0_1727319881232.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_teachermomentsconfusion_pipeline_en_5.5.0_3.0_1727319881232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_finetuned_teachermomentsconfusion_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_finetuned_teachermomentsconfusion_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_teachermomentsconfusion_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Ruborobot/bert-base-cased-finetuned-TeacherMomentsConfusion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_ft5_3ep_s42_exp1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_ft5_3ep_s42_exp1_en.md new file mode 100644 index 00000000000000..4e6e76965e4792 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_ft5_3ep_s42_exp1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_ft5_3ep_s42_exp1 BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_cased_ft5_3ep_s42_exp1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ft5_3ep_s42_exp1` is an English model originally trained by AbhishekkV19.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft5_3ep_s42_exp1_en_5.5.0_3.0_1727322730156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft5_3ep_s42_exp1_en_5.5.0_3.0_1727322730156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft5_3ep_s42_exp1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft5_3ep_s42_exp1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ft5_3ep_s42_exp1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-cased-ft5-3ep-s42-exp1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_ft5_6ep_s42_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_ft5_6ep_s42_en.md new file mode 100644 index 00000000000000..0d9ad92b9b821a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_ft5_6ep_s42_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_ft5_6ep_s42 BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_cased_ft5_6ep_s42 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ft5_6ep_s42` is an English model originally trained by AbhishekkV19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft5_6ep_s42_en_5.5.0_3.0_1727353091301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ft5_6ep_s42_en_5.5.0_3.0_1727353091301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft5_6ep_s42","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ft5_6ep_s42", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ft5_6ep_s42| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-cased-ft5-6ep-s42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_greecewildfire_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_greecewildfire_en.md new file mode 100644 index 00000000000000..531dca6c8700a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_greecewildfire_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_greecewildfire BertForSequenceClassification from rizvi-rahil786 +author: John Snow Labs +name: bert_base_cased_greecewildfire +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_greecewildfire` is an English model originally trained by rizvi-rahil786. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_greecewildfire_en_5.5.0_3.0_1727319598522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_greecewildfire_en_5.5.0_3.0_1727319598522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_greecewildfire","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_greecewildfire", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_greecewildfire| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/rizvi-rahil786/bert-base-cased-greeceWildfire \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_hardaderail_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_hardaderail_pipeline_en.md new file mode 100644 index 00000000000000..6ec3da8d377590 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_hardaderail_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_hardaderail_pipeline pipeline BertForSequenceClassification from rizvi-rahil786 +author: John Snow Labs +name: bert_base_cased_hardaderail_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_hardaderail_pipeline` is an English model originally trained by rizvi-rahil786. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_hardaderail_pipeline_en_5.5.0_3.0_1727350393358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_hardaderail_pipeline_en_5.5.0_3.0_1727350393358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_hardaderail_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_hardaderail_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_hardaderail_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/rizvi-rahil786/bert-base-cased-hardaDerail + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_lora_592k_snli_model1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_lora_592k_snli_model1_pipeline_en.md new file mode 100644 index 00000000000000..721ae98947a0da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_lora_592k_snli_model1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_lora_592k_snli_model1_pipeline pipeline BertForSequenceClassification from varun-v-rao +author: John Snow Labs +name: bert_base_cased_lora_592k_snli_model1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_lora_592k_snli_model1_pipeline` is an English model originally trained by varun-v-rao.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_lora_592k_snli_model1_pipeline_en_5.5.0_3.0_1727313794205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_lora_592k_snli_model1_pipeline_en_5.5.0_3.0_1727313794205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_lora_592k_snli_model1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_lora_592k_snli_model1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_lora_592k_snli_model1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/varun-v-rao/bert-base-cased-lora-592K-snli-model1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_mexicoquake_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_mexicoquake_pipeline_en.md new file mode 100644 index 00000000000000..5eedb1c981f046 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_mexicoquake_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_mexicoquake_pipeline pipeline BertForSequenceClassification from rizvi-rahil786 +author: John Snow Labs +name: bert_base_cased_mexicoquake_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_mexicoquake_pipeline` is an English model originally trained by rizvi-rahil786.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_mexicoquake_pipeline_en_5.5.0_3.0_1727347374661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_mexicoquake_pipeline_en_5.5.0_3.0_1727347374661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_mexicoquake_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_mexicoquake_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_mexicoquake_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/rizvi-rahil786/bert-base-cased-mexicoQuake + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_mnli_model10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_mnli_model10_pipeline_en.md new file mode 100644 index 00000000000000..24ce1406362ef3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_mnli_model10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_mnli_model10_pipeline pipeline BertForSequenceClassification from varun-v-rao +author: John Snow Labs +name: bert_base_cased_mnli_model10_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_mnli_model10_pipeline` is an English model originally trained by varun-v-rao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_mnli_model10_pipeline_en_5.5.0_3.0_1727321921658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_mnli_model10_pipeline_en_5.5.0_3.0_1727321921658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_mnli_model10_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_mnli_model10_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_mnli_model10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/varun-v-rao/bert-base-cased-mnli-model10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_paraphrase_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_paraphrase_classification_en.md new file mode 100644 index 00000000000000..e54a2676a34c07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_paraphrase_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_paraphrase_classification BertForSequenceClassification from rushilJariwala +author: John Snow Labs +name: bert_base_cased_paraphrase_classification +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_paraphrase_classification` is an English model originally trained by rushilJariwala.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_paraphrase_classification_en_5.5.0_3.0_1727317895681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_paraphrase_classification_en_5.5.0_3.0_1727317895681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_paraphrase_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_paraphrase_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_paraphrase_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rushilJariwala/bert-base-cased-paraphrase-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_snli_model4_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_snli_model4_en.md new file mode 100644 index 00000000000000..9bb322569c99a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_snli_model4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_snli_model4 BertForSequenceClassification from varun-v-rao +author: John Snow Labs +name: bert_base_cased_snli_model4 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_snli_model4` is an English model originally trained by varun-v-rao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_snli_model4_en_5.5.0_3.0_1727345294451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_snli_model4_en_5.5.0_3.0_1727345294451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_snli_model4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_snli_model4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_snli_model4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/varun-v-rao/bert-base-cased-snli-model4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_snli_model5_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_snli_model5_en.md new file mode 100644 index 00000000000000..29c5d6c2614d87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_snli_model5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_snli_model5 BertForSequenceClassification from varun-v-rao +author: John Snow Labs +name: bert_base_cased_snli_model5 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_snli_model5` is an English model originally trained by varun-v-rao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_snli_model5_en_5.5.0_3.0_1727349987667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_snli_model5_en_5.5.0_3.0_1727349987667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_snli_model5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_snli_model5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_snli_model5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/varun-v-rao/bert-base-cased-snli-model5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_xuehangcang_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_xuehangcang_en.md new file mode 100644 index 00000000000000..8269e9b3ca9db7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_xuehangcang_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_cased_xuehangcang BertForSequenceClassification from XuehangCang +author: John Snow Labs +name: bert_base_cased_xuehangcang +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_xuehangcang` is an English model originally trained by XuehangCang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_xuehangcang_en_5.5.0_3.0_1727339171130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_xuehangcang_en_5.5.0_3.0_1727339171130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_xuehangcang","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_xuehangcang", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_xuehangcang| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/XuehangCang/bert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_xuehangcang_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_xuehangcang_pipeline_en.md new file mode 100644 index 00000000000000..503537edc6bb9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_cased_xuehangcang_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_cased_xuehangcang_pipeline pipeline BertForSequenceClassification from XuehangCang +author: John Snow Labs +name: bert_base_cased_xuehangcang_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_xuehangcang_pipeline` is an English model originally trained by XuehangCang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_xuehangcang_pipeline_en_5.5.0_3.0_1727339192699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_xuehangcang_pipeline_en_5.5.0_3.0_1727339192699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_cased_xuehangcang_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_cased_xuehangcang_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_xuehangcang_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/XuehangCang/bert-base-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_accidentreason_classifier_zh.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_accidentreason_classifier_zh.md new file mode 100644 index 00000000000000..c3a74e3500f290 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_accidentreason_classifier_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese bert_base_chinese_accidentreason_classifier BertForSequenceClassification from posie +author: John Snow Labs +name: bert_base_chinese_accidentreason_classifier +date: 2024-09-26 +tags: [zh, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_accidentreason_classifier` is a Chinese model originally trained by posie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_accidentreason_classifier_zh_5.5.0_3.0_1727369473802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_accidentreason_classifier_zh_5.5.0_3.0_1727369473802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_accidentreason_classifier","zh") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_accidentreason_classifier", "zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_accidentreason_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|382.5 MB| + +## References + +https://huggingface.co/posie/bert-base-chinese-accidentreason-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_baidu_fintune_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_baidu_fintune_pipeline_en.md new file mode 100644 index 00000000000000..dbfae8b637baf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_baidu_fintune_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_chinese_baidu_fintune_pipeline pipeline BertForSequenceClassification from rylai88 +author: John Snow Labs +name: bert_base_chinese_baidu_fintune_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_baidu_fintune_pipeline` is an English model originally trained by rylai88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_baidu_fintune_pipeline_en_5.5.0_3.0_1727344667637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_baidu_fintune_pipeline_en_5.5.0_3.0_1727344667637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_chinese_baidu_fintune_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_chinese_baidu_fintune_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_baidu_fintune_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/rylai88/bert_base_chinese_baidu_fintune + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v4_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v4_en.md new file mode 100644 index 00000000000000..d6b32d515e949f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_chinese_climate_related_prediction_v4 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_related_prediction_v4 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_related_prediction_v4` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_v4_en_5.5.0_3.0_1727317766019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_v4_en_5.5.0_3.0_1727317766019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_related_prediction_v4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_related_prediction_v4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_related_prediction_v4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-related-prediction-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v4_pipeline_en.md new file mode 100644 index 00000000000000..b28741bcb94393 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_chinese_climate_related_prediction_v4_pipeline pipeline BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_related_prediction_v4_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_related_prediction_v4_pipeline` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_v4_pipeline_en_5.5.0_3.0_1727317785925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_v4_pipeline_en_5.5.0_3.0_1727317785925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_chinese_climate_related_prediction_v4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_chinese_climate_related_prediction_v4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_related_prediction_v4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-related-prediction-v4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v6_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v6_en.md new file mode 100644 index 00000000000000..5347bb5c597443 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_v6_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_chinese_climate_related_prediction_v6 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_related_prediction_v6 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_related_prediction_v6` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_v6_en_5.5.0_3.0_1727311668128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_v6_en_5.5.0_3.0_1727311668128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_related_prediction_v6","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_related_prediction_v6", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_related_prediction_v6| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-related-prediction-v6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_vv3_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_vv3_en.md new file mode 100644 index 00000000000000..755d5e6adeb314 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_related_prediction_vv3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_chinese_climate_related_prediction_vv3 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_related_prediction_vv3 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_related_prediction_vv3` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_vv3_en_5.5.0_3.0_1727349202082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_related_prediction_vv3_en_5.5.0_3.0_1727349202082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_related_prediction_vv3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_related_prediction_vv3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_related_prediction_vv3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-related-prediction-vv3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_risk_opportunity_prediction_vv4_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_risk_opportunity_prediction_vv4_en.md new file mode 100644 index 00000000000000..e6b694fff3c08e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_risk_opportunity_prediction_vv4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_chinese_climate_risk_opportunity_prediction_vv4 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_risk_opportunity_prediction_vv4 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_risk_opportunity_prediction_vv4` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_risk_opportunity_prediction_vv4_en_5.5.0_3.0_1727353090424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_risk_opportunity_prediction_vv4_en_5.5.0_3.0_1727353090424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_risk_opportunity_prediction_vv4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_risk_opportunity_prediction_vv4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_risk_opportunity_prediction_vv4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-risk-opportunity-prediction-vv4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline_en.md new file mode 100644 index 00000000000000..563794d9ee29c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline pipeline BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline_en_5.5.0_3.0_1727353120057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline_en_5.5.0_3.0_1727353120057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_risk_opportunity_prediction_vv4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-risk-opportunity-prediction-vv4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_transition_physical_risk_prediction_6_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_transition_physical_risk_prediction_6_en.md new file mode 100644 index 00000000000000..d98d7aae836d63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_transition_physical_risk_prediction_6_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_chinese_climate_transition_physical_risk_prediction_6 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_transition_physical_risk_prediction_6 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_transition_physical_risk_prediction_6` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_transition_physical_risk_prediction_6_en_5.5.0_3.0_1727354602016.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_transition_physical_risk_prediction_6_en_5.5.0_3.0_1727354602016.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_transition_physical_risk_prediction_6","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_climate_transition_physical_risk_prediction_6", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_transition_physical_risk_prediction_6| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-transition-physical-risk-prediction-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline_en.md new file mode 100644 index 00000000000000..489fa01526f051 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline pipeline BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline_en_5.5.0_3.0_1727354620924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline_en_5.5.0_3.0_1727354620924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_climate_transition_physical_risk_prediction_6_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-climate-transition-physical-risk-prediction-6 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_en.md new file mode 100644 index 00000000000000..c0b95f8b0b4135 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_chinese BertForSequenceClassification from watsonpro +author: John Snow Labs +name: bert_base_chinese +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese` is an English model originally trained by watsonpro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_en_5.5.0_3.0_1727356554191.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_en_5.5.0_3.0_1727356554191.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|39.8 MB| + +## References + +https://huggingface.co/watsonpro/bert-base-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuned_intent_recognition_biomedical_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuned_intent_recognition_biomedical_en.md new file mode 100644 index 00000000000000..0a999b753455e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuned_intent_recognition_biomedical_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_chinese_finetuned_intent_recognition_biomedical BertForSequenceClassification from nlp-guild +author: John Snow Labs +name: bert_base_chinese_finetuned_intent_recognition_biomedical +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_intent_recognition_biomedical` is an English model originally trained by nlp-guild. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_intent_recognition_biomedical_en_5.5.0_3.0_1727365704529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_intent_recognition_biomedical_en_5.5.0_3.0_1727365704529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuned_intent_recognition_biomedical","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuned_intent_recognition_biomedical", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuned_intent_recognition_biomedical| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/nlp-guild/bert-base-chinese-finetuned-intent_recognition-biomedical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuning_financial_news_sentiment_test_zh.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuning_financial_news_sentiment_test_zh.md new file mode 100644 index 00000000000000..1766ea3b9e5f26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuning_financial_news_sentiment_test_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese bert_base_chinese_finetuning_financial_news_sentiment_test BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_finetuning_financial_news_sentiment_test +date: 2024-09-26 +tags: [zh, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuning_financial_news_sentiment_test` is a Chinese model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuning_financial_news_sentiment_test_zh_5.5.0_3.0_1727358901399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuning_financial_news_sentiment_test_zh_5.5.0_3.0_1727358901399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuning_financial_news_sentiment_test","zh") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuning_financial_news_sentiment_test", "zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuning_financial_news_sentiment_test| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-finetuning-financial-news-sentiment-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1_zh.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1_zh.md new file mode 100644 index 00000000000000..9cedab95790e38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1 +date: 2024-09-26 +tags: [zh, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1` is a Chinese model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1_zh_5.5.0_3.0_1727342576784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1_zh_5.5.0_3.0_1727342576784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1","zh") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1", "zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuning_wallstreetcn_morning_news_vix_sz50_v1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-finetuning-wallstreetcn-morning-news-vix-sz50-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_pipeline_en.md new file mode 100644 index 00000000000000..5cf5ed0b02cf85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_chinese_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_chinese_pipeline pipeline BertForSequenceClassification from watsonpro +author: John Snow Labs +name: bert_base_chinese_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_pipeline` is an English model originally trained by watsonpro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_pipeline_en_5.5.0_3.0_1727356556708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_pipeline_en_5.5.0_3.0_1727356556708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_chinese_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_chinese_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|39.8 MB| + +## References + +https://huggingface.co/watsonpro/bert-base-chinese + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_climate_fever_fixed_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_climate_fever_fixed_en.md new file mode 100644 index 00000000000000..53152fa198ffc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_climate_fever_fixed_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_climate_fever_fixed BertForSequenceClassification from rexarski +author: John Snow Labs +name: bert_base_climate_fever_fixed +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_climate_fever_fixed` is an English model originally trained by rexarski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_climate_fever_fixed_en_5.5.0_3.0_1727325877674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_climate_fever_fixed_en_5.5.0_3.0_1727325877674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_climate_fever_fixed","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_climate_fever_fixed", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_climate_fever_fixed| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rexarski/bert-base-climate-fever-fixed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_climate_fever_fixed_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_climate_fever_fixed_pipeline_en.md new file mode 100644 index 00000000000000..2dc3240621d15d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_climate_fever_fixed_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_climate_fever_fixed_pipeline pipeline BertForSequenceClassification from rexarski +author: John Snow Labs +name: bert_base_climate_fever_fixed_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_climate_fever_fixed_pipeline` is an English model originally trained by rexarski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_climate_fever_fixed_pipeline_en_5.5.0_3.0_1727325899446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_climate_fever_fixed_pipeline_en_5.5.0_3.0_1727325899446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_climate_fever_fixed_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_climate_fever_fixed_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_climate_fever_fixed_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rexarski/bert-base-climate-fever-fixed + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_daichi_support_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_daichi_support_pipeline_en.md new file mode 100644 index 00000000000000..9015bf139ae8a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_daichi_support_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_daichi_support_pipeline pipeline BertForSequenceClassification from DonMakar +author: John Snow Labs +name: bert_base_daichi_support_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_daichi_support_pipeline` is an English model originally trained by DonMakar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_daichi_support_pipeline_en_5.5.0_3.0_1727346222371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_daichi_support_pipeline_en_5.5.0_3.0_1727346222371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_daichi_support_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_daichi_support_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_daichi_support_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/DonMakar/bert-base-Daichi_support + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_emotion_24_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_emotion_24_en.md new file mode 100644 index 00000000000000..50376ecbe6f045 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_emotion_24_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_emotion_24 BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_base_emotion_24 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_emotion_24` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_emotion_24_en_5.5.0_3.0_1727320608757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_emotion_24_en_5.5.0_3.0_1727320608757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_emotion_24","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_emotion_24", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_emotion_24| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.2 MB| + +## References + +https://huggingface.co/gokuls/bert-base-emotion_24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_emotion_24_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_emotion_24_pipeline_en.md new file mode 100644 index 00000000000000..5a2fc263628568 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_emotion_24_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_emotion_24_pipeline pipeline BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_base_emotion_24_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_emotion_24_pipeline` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_emotion_24_pipeline_en_5.5.0_3.0_1727320630397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_emotion_24_pipeline_en_5.5.0_3.0_1727320630397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_emotion_24_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_emotion_24_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_emotion_24_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.2 MB| + +## References + +https://huggingface.co/gokuls/bert-base-emotion_24 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_fine_tuned_text_classificarion_ds_dropout_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_fine_tuned_text_classificarion_ds_dropout_en.md new file mode 100644 index 00000000000000..4d657f58c90bee --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_fine_tuned_text_classificarion_ds_dropout_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_fine_tuned_text_classificarion_ds_dropout BertForSequenceClassification from Sleoruiz +author: John Snow Labs +name: bert_base_fine_tuned_text_classificarion_ds_dropout +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_fine_tuned_text_classificarion_ds_dropout` is an English model originally trained by Sleoruiz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_fine_tuned_text_classificarion_ds_dropout_en_5.5.0_3.0_1727353881243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_fine_tuned_text_classificarion_ds_dropout_en_5.5.0_3.0_1727353881243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_fine_tuned_text_classificarion_ds_dropout","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_fine_tuned_text_classificarion_ds_dropout", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_fine_tuned_text_classificarion_ds_dropout| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.4 MB| + +## References + +https://huggingface.co/Sleoruiz/bert-base-fine-tuned-text-classificarion-ds-dropout \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_en.md new file mode 100644 index 00000000000000..b508509faad48f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_finetuned BertForSequenceClassification from kagented +author: John Snow Labs +name: bert_base_finetuned +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned` is an English model originally trained by kagented. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_en_5.5.0_3.0_1727319588133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_en_5.5.0_3.0_1727319588133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|415.2 MB| + +## References + +https://huggingface.co/kagented/bert-base-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_lcqmc_chinese_pipeline_zh.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_lcqmc_chinese_pipeline_zh.md new file mode 100644 index 00000000000000..78fcef7b966349 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_lcqmc_chinese_pipeline_zh.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Chinese bert_base_finetuned_lcqmc_chinese_pipeline pipeline BertForSequenceClassification from WangA +author: John Snow Labs +name: bert_base_finetuned_lcqmc_chinese_pipeline +date: 2024-09-26 +tags: [zh, open_source, pipeline, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_lcqmc_chinese_pipeline` is a Chinese model originally trained by WangA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_lcqmc_chinese_pipeline_zh_5.5.0_3.0_1727353268948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_lcqmc_chinese_pipeline_zh_5.5.0_3.0_1727353268948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_finetuned_lcqmc_chinese_pipeline", lang = "zh") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_finetuned_lcqmc_chinese_pipeline", lang = "zh") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_lcqmc_chinese_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/WangA/bert-base-finetuned-lcqmc-chinese + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_lcqmc_chinese_zh.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_lcqmc_chinese_zh.md new file mode 100644 index 00000000000000..1f7a379aadfcb6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_lcqmc_chinese_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese bert_base_finetuned_lcqmc_chinese BertForSequenceClassification from WangA +author: John Snow Labs +name: bert_base_finetuned_lcqmc_chinese +date: 2024-09-26 +tags: [zh, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_lcqmc_chinese` is a Chinese model originally trained by WangA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_lcqmc_chinese_zh_5.5.0_3.0_1727353249448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_lcqmc_chinese_zh_5.5.0_3.0_1727353249448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_lcqmc_chinese","zh") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_lcqmc_chinese", "zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_lcqmc_chinese| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/WangA/bert-base-finetuned-lcqmc-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_cloudblack_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_cloudblack_en.md new file mode 100644 index 00000000000000..64cac99297938d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_cloudblack_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_finetuned_sts_cloudblack BertForSequenceClassification from cloudblack +author: John Snow Labs +name: bert_base_finetuned_sts_cloudblack +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_cloudblack` is an English model originally trained by cloudblack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_cloudblack_en_5.5.0_3.0_1727346220226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_cloudblack_en_5.5.0_3.0_1727346220226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_cloudblack","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_cloudblack", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_cloudblack| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/cloudblack/bert-base-finetuned-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_cloudblack_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_cloudblack_pipeline_en.md new file mode 100644 index 00000000000000..d5a3f168e93957 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_cloudblack_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_finetuned_sts_cloudblack_pipeline pipeline BertForSequenceClassification from cloudblack +author: John Snow Labs +name: bert_base_finetuned_sts_cloudblack_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_cloudblack_pipeline` is an English model originally trained by cloudblack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_cloudblack_pipeline_en_5.5.0_3.0_1727346244818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_cloudblack_pipeline_en_5.5.0_3.0_1727346244818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_finetuned_sts_cloudblack_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_finetuned_sts_cloudblack_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_cloudblack_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/cloudblack/bert-base-finetuned-sts + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_deprecated_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_deprecated_en.md new file mode 100644 index 00000000000000..6c5c07bb502297 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_deprecated_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_finetuned_sts_deprecated BertForSequenceClassification from eliza-dukim +author: John Snow Labs +name: bert_base_finetuned_sts_deprecated +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_deprecated` is an English model originally trained by eliza-dukim.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_deprecated_en_5.5.0_3.0_1727311983812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_deprecated_en_5.5.0_3.0_1727311983812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_deprecated","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_deprecated", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_deprecated| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/eliza-dukim/bert-base-finetuned-sts-deprecated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_deprecated_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_deprecated_pipeline_en.md new file mode 100644 index 00000000000000..43b1dcaa1ec27e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_deprecated_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_finetuned_sts_deprecated_pipeline pipeline BertForSequenceClassification from eliza-dukim +author: John Snow Labs +name: bert_base_finetuned_sts_deprecated_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_deprecated_pipeline` is an English model originally trained by eliza-dukim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_deprecated_pipeline_en_5.5.0_3.0_1727312008851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_deprecated_pipeline_en_5.5.0_3.0_1727312008851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_finetuned_sts_deprecated_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_finetuned_sts_deprecated_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_deprecated_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/eliza-dukim/bert-base-finetuned-sts-deprecated + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_ezre_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_ezre_en.md new file mode 100644 index 00000000000000..9c14c68d74e34d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_ezre_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_finetuned_sts_ezre BertForSequenceClassification from Ezre +author: John Snow Labs +name: bert_base_finetuned_sts_ezre +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_ezre` is an English model originally trained by Ezre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_ezre_en_5.5.0_3.0_1727355061443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_ezre_en_5.5.0_3.0_1727355061443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_ezre","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_ezre", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_ezre| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/Ezre/bert-base-finetuned-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_ezre_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_ezre_pipeline_en.md new file mode 100644 index 00000000000000..7ff64344345a51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_finetuned_sts_ezre_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_finetuned_sts_ezre_pipeline pipeline BertForSequenceClassification from Ezre +author: John Snow Labs +name: bert_base_finetuned_sts_ezre_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_ezre_pipeline` is an English model originally trained by Ezre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_ezre_pipeline_en_5.5.0_3.0_1727355085254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_ezre_pipeline_en_5.5.0_3.0_1727355085254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_finetuned_sts_ezre_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_finetuned_sts_ezre_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_ezre_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/Ezre/bert-base-finetuned-sts + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_german_cased_hatespeech_germeval18_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_german_cased_hatespeech_germeval18_pipeline_en.md new file mode 100644 index 00000000000000..5f617fc9d3b325 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_german_cased_hatespeech_germeval18_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_german_cased_hatespeech_germeval18_pipeline pipeline BertForSequenceClassification from CundK +author: John Snow Labs +name: bert_base_german_cased_hatespeech_germeval18_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_cased_hatespeech_germeval18_pipeline` is an English model originally trained by CundK.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_hatespeech_germeval18_pipeline_en_5.5.0_3.0_1727369847316.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_hatespeech_germeval18_pipeline_en_5.5.0_3.0_1727369847316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_german_cased_hatespeech_germeval18_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_german_cased_hatespeech_germeval18_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_hatespeech_germeval18_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/CundK/bert-base-german-cased-hatespeech-GermEval18 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02_en.md new file mode 100644 index 00000000000000..9a1bee2ec38591 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02 BertForSequenceClassification from aliiil02 +author: John Snow Labs +name: bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02` is an English model originally trained by aliiil02.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02_en_5.5.0_3.0_1727312193144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02_en_5.5.0_3.0_1727312193144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_aliiil02| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.8 MB| + +## References + +https://huggingface.co/aliiil02/bert-base-indonesian-1.5G-sentiment-analysis-smsa-tuning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline_en.md new file mode 100644 index 00000000000000..9ea874f7c1f07b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline pipeline BertForSequenceClassification from rahmaabusalma +author: John Snow Labs +name: bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline` is an English model originally trained by rahmaabusalma.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline_en_5.5.0_3.0_1727345183912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline_en_5.5.0_3.0_1727345183912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_indonesian_1_5g_sentiment_analysis_smsa_tuning_rahmaabusalma_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.8 MB| + +## References + +https://huggingface.co/rahmaabusalma/bert-base-indonesian-1.5G-sentiment-analysis-smsa-tuning + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_massive_intent_24_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_massive_intent_24_en.md new file mode 100644 index 00000000000000..c994ca310f1ce9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_massive_intent_24_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_massive_intent_24 BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_base_massive_intent_24 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_massive_intent_24` is an English model originally trained by gokuls.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_24_en_5.5.0_3.0_1727318855186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_massive_intent_24_en_5.5.0_3.0_1727318855186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_massive_intent_24","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_massive_intent_24", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_massive_intent_24| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/gokuls/bert-base-Massive-intent_24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_02_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_02_pipeline_xx.md new file mode 100644 index 00000000000000..da1d3bb0326a04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_02_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_02_pipeline pipeline BertForSequenceClassification from TiagoSanti +author: John Snow Labs +name: bert_base_multilingual_cased_02_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_02_pipeline` is a Multilingual model originally trained by TiagoSanti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_02_pipeline_xx_5.5.0_3.0_1727322634024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_02_pipeline_xx_5.5.0_3.0_1727322634024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_02_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_02_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_02_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/TiagoSanti/bert-base-multilingual-cased-02 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_02_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_02_xx.md new file mode 100644 index 00000000000000..40cdc95cd8be61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_02_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_02 BertForSequenceClassification from TiagoSanti +author: John Snow Labs +name: bert_base_multilingual_cased_02 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_02` is a Multilingual model originally trained by TiagoSanti. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_02_xx_5.5.0_3.0_1727322599959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_02_xx_5.5.0_3.0_1727322599959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_02","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_02", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_02| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/TiagoSanti/bert-base-multilingual-cased-02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_brenomatos_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_brenomatos_xx.md new file mode 100644 index 00000000000000..b759dc4005d31f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_brenomatos_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_brenomatos BertForSequenceClassification from brenomatos +author: John Snow Labs +name: bert_base_multilingual_cased_brenomatos +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_brenomatos` is a Multilingual model originally trained by brenomatos. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_brenomatos_xx_5.5.0_3.0_1727320676083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_brenomatos_xx_5.5.0_3.0_1727320676083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_brenomatos","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_brenomatos", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_brenomatos| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/brenomatos/bert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline_xx.md new file mode 100644 index 00000000000000..3f1776db52d42e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline pipeline BertForSequenceClassification from mdosama39 +author: John Snow Labs +name: bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline` is a Multilingual model originally trained by mdosama39. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline_xx_5.5.0_3.0_1727321442416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline_xx_5.5.0_3.0_1727321442416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_caste_hatespech_ltedi_mbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/mdosama39/bert-base-multilingual-cased-Caste-HateSpech_LTEDi-mBert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_anxoanxo_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_anxoanxo_pipeline_xx.md new file mode 100644 index 00000000000000..9ad5dadb4024f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_anxoanxo_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_anxoanxo_pipeline pipeline BertForSequenceClassification from anxoanxo +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_anxoanxo_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_anxoanxo_pipeline` is a Multilingual model originally trained by anxoanxo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_anxoanxo_pipeline_xx_5.5.0_3.0_1727316198978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_anxoanxo_pipeline_xx_5.5.0_3.0_1727316198978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_finetuned_anxoanxo_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_finetuned_anxoanxo_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_anxoanxo_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/anxoanxo/bert-base-multilingual-cased-finetuned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_anxoanxo_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_anxoanxo_xx.md new file mode 100644 index 00000000000000..3ddc2470403ce5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_anxoanxo_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_anxoanxo BertForSequenceClassification from anxoanxo +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_anxoanxo +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_anxoanxo` is a Multilingual model originally trained by anxoanxo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_anxoanxo_xx_5.5.0_3.0_1727316164653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_anxoanxo_xx_5.5.0_3.0_1727316164653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_anxoanxo","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_anxoanxo", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_anxoanxo| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/anxoanxo/bert-base-multilingual-cased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline_xx.md new file mode 100644 index 00000000000000..5866149dfb816d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline pipeline BertForSequenceClassification from MayaGalvez +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline` is a Multilingual model originally trained by MayaGalvez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline_xx_5.5.0_3.0_1727335044602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline_xx_5.5.0_3.0_1727335044602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_multilingual_nli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/MayaGalvez/bert-base-multilingual-cased-finetuned-multilingual-nli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mnli_100_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mnli_100_pipeline_xx.md new file mode 100644 index 00000000000000..a4c29976a18271 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mnli_100_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_mnli_100_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_mnli_100_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_mnli_100_pipeline` is a Multilingual model originally trained by tmnam20. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mnli_100_pipeline_xx_5.5.0_3.0_1727336044976.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mnli_100_pipeline_xx_5.5.0_3.0_1727336044976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_mnli_100_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_mnli_100_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_mnli_100_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-mnli-100 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mnli_100_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mnli_100_xx.md new file mode 100644 index 00000000000000..0e81981b2d3167 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mnli_100_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_mnli_100 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_mnli_100 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_mnli_100` is a Multilingual model originally trained by tmnam20. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mnli_100_xx_5.5.0_3.0_1727336010337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mnli_100_xx_5.5.0_3.0_1727336010337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mnli_100","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mnli_100", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_mnli_100| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-mnli-100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_100_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_100_pipeline_xx.md new file mode 100644 index 00000000000000..1d7cced294473e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_100_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_mrpc_100_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_mrpc_100_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_mrpc_100_pipeline` is a Multilingual model originally trained by tmnam20. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_100_pipeline_xx_5.5.0_3.0_1727322482869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_100_pipeline_xx_5.5.0_3.0_1727322482869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_mrpc_100_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_mrpc_100_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_mrpc_100_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-mrpc-100 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_100_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_100_xx.md new file mode 100644 index 00000000000000..7fc1cb73956d74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_100_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_mrpc_100 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_mrpc_100 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_mrpc_100` is a Multilingual model originally trained by tmnam20. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_100_xx_5.5.0_3.0_1727322440387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_100_xx_5.5.0_3.0_1727322440387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mrpc_100","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mrpc_100", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_mrpc_100| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-mrpc-100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_10_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_10_xx.md new file mode 100644 index 00000000000000..954a3fddd2e4e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_10_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_mrpc_10 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_mrpc_10 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_mrpc_10` is a multilingual model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_10_xx_5.5.0_3.0_1727319613732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_10_xx_5.5.0_3.0_1727319613732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mrpc_10","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mrpc_10", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_mrpc_10| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-mrpc-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_1_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_1_pipeline_xx.md new file mode 100644 index 00000000000000..6f34184eefbe8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_1_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_mrpc_1_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_mrpc_1_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_mrpc_1_pipeline` is a multilingual model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_1_pipeline_xx_5.5.0_3.0_1727353634624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_1_pipeline_xx_5.5.0_3.0_1727353634624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_mrpc_1_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_mrpc_1_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_mrpc_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-mrpc-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_1_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_1_xx.md new file mode 100644 index 00000000000000..6e9ce4e3321238 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_mrpc_1_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_mrpc_1 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_mrpc_1 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_mrpc_1` is a multilingual model originally trained by tmnam20.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_1_xx_5.5.0_3.0_1727353593752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_mrpc_1_xx_5.5.0_3.0_1727353593752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mrpc_1","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_mrpc_1", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_mrpc_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-mrpc-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_qqp_100_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_qqp_100_pipeline_xx.md new file mode 100644 index 00000000000000..856e267d5626b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_qqp_100_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_qqp_100_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_qqp_100_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_qqp_100_pipeline` is a multilingual model originally trained by tmnam20.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qqp_100_pipeline_xx_5.5.0_3.0_1727353872344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qqp_100_pipeline_xx_5.5.0_3.0_1727353872344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_qqp_100_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_qqp_100_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_qqp_100_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-qqp-100 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_qqp_100_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_qqp_100_xx.md new file mode 100644 index 00000000000000..e6916630846c4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_qqp_100_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_qqp_100 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_qqp_100 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_qqp_100` is a multilingual model originally trained by tmnam20.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qqp_100_xx_5.5.0_3.0_1727353828321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_qqp_100_xx_5.5.0_3.0_1727353828321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_qqp_100","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_qqp_100", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_qqp_100| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-qqp-100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_rte_10_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_rte_10_pipeline_xx.md new file mode 100644 index 00000000000000..7514d3fa83741e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_rte_10_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_rte_10_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_rte_10_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_rte_10_pipeline` is a multilingual model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_rte_10_pipeline_xx_5.5.0_3.0_1727312618516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_rte_10_pipeline_xx_5.5.0_3.0_1727312618516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_rte_10_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_rte_10_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_rte_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-rte-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_sst2_10_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_sst2_10_pipeline_xx.md new file mode 100644 index 00000000000000..52c95e25dd9798 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_sst2_10_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_sst2_10_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_sst2_10_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_sst2_10_pipeline` is a multilingual model originally trained by tmnam20.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sst2_10_pipeline_xx_5.5.0_3.0_1727312921465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sst2_10_pipeline_xx_5.5.0_3.0_1727312921465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_sst2_10_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_sst2_10_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_sst2_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-sst2-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_tiagosanti_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_tiagosanti_xx.md new file mode 100644 index 00000000000000..a1ca6736fd613c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_tiagosanti_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_tiagosanti BertForSequenceClassification from TiagoSanti +author: John Snow Labs +name: bert_base_multilingual_cased_tiagosanti +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_tiagosanti` is a multilingual model originally trained by TiagoSanti.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_tiagosanti_xx_5.5.0_3.0_1727353748292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_tiagosanti_xx_5.5.0_3.0_1727353748292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_tiagosanti","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_tiagosanti", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_tiagosanti| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/TiagoSanti/bert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_vsmec_1_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_vsmec_1_pipeline_xx.md new file mode 100644 index 00000000000000..5ba3051172d054 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_vsmec_1_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_vsmec_1_pipeline pipeline BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_vsmec_1_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_vsmec_1_pipeline` is a multilingual model originally trained by tmnam20.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vsmec_1_pipeline_xx_5.5.0_3.0_1727333904473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vsmec_1_pipeline_xx_5.5.0_3.0_1727333904473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_cased_vsmec_1_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_cased_vsmec_1_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_vsmec_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-vsmec-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_vsmec_1_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_vsmec_1_xx.md new file mode 100644 index 00000000000000..c3931591f715a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_cased_vsmec_1_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_vsmec_1 BertForSequenceClassification from tmnam20 +author: John Snow Labs +name: bert_base_multilingual_cased_vsmec_1 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_vsmec_1` is a multilingual model originally trained by tmnam20.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vsmec_1_xx_5.5.0_3.0_1727333861637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vsmec_1_xx_5.5.0_3.0_1727333861637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_vsmec_1","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_vsmec_1", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_vsmec_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tmnam20/bert-base-multilingual-cased-vsmec-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline_xx.md new file mode 100644 index 00000000000000..cfe07fd89ef558 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline pipeline BertForSequenceClassification from mfidabel +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline` is a multilingual model originally trained by mfidabel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline_xx_5.5.0_3.0_1727364790661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline_xx_5.5.0_3.0_1727364790661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_finetuned_meia_analisisdesentimientos_mfidabel_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/mfidabel/bert-base-multilingual-uncased-sentiment-finetuned-MeIA-AnalisisDeSentimientos + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_meia_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_meia_pipeline_xx.md new file mode 100644 index 00000000000000..2d782bd9bc84d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_meia_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_meia_pipeline pipeline BertForSequenceClassification from jyarac +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_meia_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_uncased_sentiment_meia_pipeline` is a Multilingual model originally trained by jyarac. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_meia_pipeline_xx_5.5.0_3.0_1727348546543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_meia_pipeline_xx_5.5.0_3.0_1727348546543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_uncased_sentiment_meia_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_uncased_sentiment_meia_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_meia_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/jyarac/bert-base-multilingual-uncased-sentiment-MeIA + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_meia_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_meia_xx.md new file mode 100644 index 00000000000000..98c83287e7499e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_meia_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_meia BertForSequenceClassification from jyarac +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_meia +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_uncased_sentiment_meia` is a Multilingual model originally trained by jyarac. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_meia_xx_5.5.0_3.0_1727348514109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_meia_xx_5.5.0_3.0_1727348514109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_meia","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_meia", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_meia| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/jyarac/bert-base-multilingual-uncased-sentiment-MeIA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_run2_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_run2_pipeline_xx.md new file mode 100644 index 00000000000000..e34105c1f5f80d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_run2_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_run2_pipeline pipeline BertForSequenceClassification from gunkaynar +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_run2_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_uncased_sentiment_run2_pipeline` is a Multilingual model originally trained by gunkaynar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_run2_pipeline_xx_5.5.0_3.0_1727315057882.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_run2_pipeline_xx_5.5.0_3.0_1727315057882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_multilingual_uncased_sentiment_run2_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_multilingual_uncased_sentiment_run2_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_run2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/gunkaynar/bert-base-multilingual-uncased-sentiment_run2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_run2_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_run2_xx.md new file mode 100644 index 00000000000000..10c09676f118e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multilingual_uncased_sentiment_run2_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_run2 BertForSequenceClassification from gunkaynar +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_run2 +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_uncased_sentiment_run2` is a Multilingual model originally trained by gunkaynar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_run2_xx_5.5.0_3.0_1727315026092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_run2_xx_5.5.0_3.0_1727315026092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_run2","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_run2", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_run2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/gunkaynar/bert-base-multilingual-uncased-sentiment_run2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_multinli_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multinli_en.md new file mode 100644 index 00000000000000..8ae11c872033c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_multinli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_multinli BertForSequenceClassification from nouf-sst +author: John Snow Labs +name: bert_base_multinli +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multinli` is an English model originally trained by nouf-sst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multinli_en_5.5.0_3.0_1727336198812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multinli_en_5.5.0_3.0_1727336198812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multinli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multinli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multinli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nouf-sst/bert-base-MultiNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_02_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_02_en.md new file mode 100644 index 00000000000000..4ef60943e1ef82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_02_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_portuguese_cased_02 BertForSequenceClassification from TiagoSanti +author: John Snow Labs +name: bert_base_portuguese_cased_02 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_02` is an English model originally trained by TiagoSanti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_02_en_5.5.0_3.0_1727336305186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_02_en_5.5.0_3.0_1727336305186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_02","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_02", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_02| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/TiagoSanti/bert-base-portuguese-cased-02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin2_entailment_pt.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin2_entailment_pt.md new file mode 100644 index 00000000000000..5a8388d34c0fbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin2_entailment_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese bert_base_portuguese_cased_assin2_entailment BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_base_portuguese_cased_assin2_entailment +date: 2024-09-26 +tags: [pt, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_portuguese_cased_assin2_entailment` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin2_entailment_pt_5.5.0_3.0_1727334220858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin2_entailment_pt_5.5.0_3.0_1727334220858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_assin2_entailment","pt") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_assin2_entailment", "pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_assin2_entailment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin2-entailment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin_entailment_pipeline_pt.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin_entailment_pipeline_pt.md new file mode 100644 index 00000000000000..9e5eb72f460d28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin_entailment_pipeline_pt.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Portuguese bert_base_portuguese_cased_assin_entailment_pipeline pipeline BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_base_portuguese_cased_assin_entailment_pipeline +date: 2024-09-26 +tags: [pt, open_source, pipeline, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_portuguese_cased_assin_entailment_pipeline` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin_entailment_pipeline_pt_5.5.0_3.0_1727352012820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin_entailment_pipeline_pt_5.5.0_3.0_1727352012820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_portuguese_cased_assin_entailment_pipeline", lang = "pt") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_portuguese_cased_assin_entailment_pipeline", lang = "pt") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_assin_entailment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin-entailment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin_entailment_pt.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin_entailment_pt.md new file mode 100644 index 00000000000000..3c80a475a1d6eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_assin_entailment_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese bert_base_portuguese_cased_assin_entailment BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_base_portuguese_cased_assin_entailment +date: 2024-09-26 +tags: [pt, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_portuguese_cased_assin_entailment` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin_entailment_pt_5.5.0_3.0_1727351991230.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin_entailment_pt_5.5.0_3.0_1727351991230.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_assin_entailment","pt") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_assin_entailment", "pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_assin_entailment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin-entailment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_leandroaraujodev_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_leandroaraujodev_en.md new file mode 100644 index 00000000000000..a2fe88bd0ce460 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_leandroaraujodev_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_portuguese_cased_leandroaraujodev BertForSequenceClassification from leandroaraujodev +author: John Snow Labs +name: bert_base_portuguese_cased_leandroaraujodev +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_leandroaraujodev` is an English model originally trained by leandroaraujodev.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_leandroaraujodev_en_5.5.0_3.0_1727328338239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_leandroaraujodev_en_5.5.0_3.0_1727328338239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_leandroaraujodev","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_leandroaraujodev", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_leandroaraujodev| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/leandroaraujodev/bert-base-portuguese-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_leandroaraujodev_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_leandroaraujodev_pipeline_en.md new file mode 100644 index 00000000000000..7a48b35648e960 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_portuguese_cased_leandroaraujodev_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_portuguese_cased_leandroaraujodev_pipeline pipeline BertForSequenceClassification from leandroaraujodev +author: John Snow Labs +name: bert_base_portuguese_cased_leandroaraujodev_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_leandroaraujodev_pipeline` is an English model originally trained by leandroaraujodev.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_leandroaraujodev_pipeline_en_5.5.0_3.0_1727328359061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_leandroaraujodev_pipeline_en_5.5.0_3.0_1727328359061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_portuguese_cased_leandroaraujodev_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_portuguese_cased_leandroaraujodev_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_leandroaraujodev_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/leandroaraujodev/bert-base-portuguese-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_sanskrit_saskta_tweets_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_sanskrit_saskta_tweets_en.md new file mode 100644 index 00000000000000..ea8d6b12e7c7d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_sanskrit_saskta_tweets_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_sanskrit_saskta_tweets BertForSequenceClassification from ricardo-filho +author: John Snow Labs +name: bert_base_sanskrit_saskta_tweets +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_sanskrit_saskta_tweets` is an English model originally trained by ricardo-filho. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_sanskrit_saskta_tweets_en_5.5.0_3.0_1727317285412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_sanskrit_saskta_tweets_en_5.5.0_3.0_1727317285412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sanskrit_saskta_tweets","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sanskrit_saskta_tweets", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_sanskrit_saskta_tweets| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ricardo-filho/bert-base-sa-tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_caresc_es.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_caresc_es.md new file mode 100644 index 00000000000000..2b9103f5993fda --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_caresc_es.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Castilian, Spanish bert_base_spanish_wwm_cased_caresc BertForSequenceClassification from IIC +author: John Snow Labs +name: bert_base_spanish_wwm_cased_caresc +date: 2024-09-26 +tags: [es, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: es +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_caresc` is a Castilian, Spanish model originally trained by IIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_caresc_es_5.5.0_3.0_1727332136098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_caresc_es_5.5.0_3.0_1727332136098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_cased_caresc","es") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_cased_caresc", "es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_caresc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.8 MB| + +## References + +https://huggingface.co/IIC/bert-base-spanish-wwm-cased-caresC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline_en.md new file mode 100644 index 00000000000000..8785fdadfe6484 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline pipeline BertForSequenceClassification from allman +author: John Snow Labs +name: bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline` is an English model originally trained by allman. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline_en_5.5.0_3.0_1727314045152.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline_en_5.5.0_3.0_1727314045152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_finetuned_meia_analisisdesentimientos_allman_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/allman/bert-base-spanish-wwm-cased-finetuned-MeIA-AnalisisDeSentimientos + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline_en.md new file mode 100644 index 00000000000000..b3684866887bc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline pipeline BertForSequenceClassification from Willy +author: John Snow Labs +name: bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline` is an English model originally trained by Willy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline_en_5.5.0_3.0_1727338926935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline_en_5.5.0_3.0_1727338926935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_finetuned_nlp_ie_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/Willy/bert-base-spanish-wwm-cased-finetuned-NLP-IE + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_k1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_k1_en.md new file mode 100644 index 00000000000000..a3112bc1a68a40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_k1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_k1 BertForSequenceClassification from dtorber +author: John Snow Labs +name: bert_base_spanish_wwm_cased_k1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_k1` is an English model originally trained by dtorber. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k1_en_5.5.0_3.0_1727349431440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k1_en_5.5.0_3.0_1727349431440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_cased_k1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_cased_k1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_k1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/dtorber/bert-base-spanish-wwm-cased_K1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_k1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_k1_pipeline_en.md new file mode 100644 index 00000000000000..d1e0958ab4c618 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_cased_k1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_k1_pipeline pipeline BertForSequenceClassification from dtorber +author: John Snow Labs +name: bert_base_spanish_wwm_cased_k1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_k1_pipeline` is an English model originally trained by dtorber. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k1_pipeline_en_5.5.0_3.0_1727349457323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_k1_pipeline_en_5.5.0_3.0_1727349457323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_cased_k1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_cased_k1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_k1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/dtorber/bert-base-spanish-wwm-cased_K1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_en.md new file mode 100644 index 00000000000000..c284ca2d88d33d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar BertForSequenceClassification from SandyDelMar +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar` is an English model originally trained by SandyDelMar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_en_5.5.0_3.0_1727352342415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_en_5.5.0_3.0_1727352342415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/SandyDelMar/bert-base-spanish-wwm-uncased-finetuned-MeIA-AnalisisDeSentimientos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline_en.md new file mode 100644 index 00000000000000..cd41744d2912d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline pipeline BertForSequenceClassification from SandyDelMar +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline` is an English model originally trained by SandyDelMar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline_en_5.5.0_3.0_1727352363906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline_en_5.5.0_3.0_1727352363906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_finetuned_meia_analisisdesentimientos_sandydelmar_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/SandyDelMar/bert-base-spanish-wwm-uncased-finetuned-MeIA-AnalisisDeSentimientos + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline_en.md new file mode 100644 index 00000000000000..2093118cd82b2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline pipeline BertForSequenceClassification from ISA-Group +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline` is an English model originally trained by ISA-Group. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline_en_5.5.0_3.0_1727348824982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline_en_5.5.0_3.0_1727348824982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_r_tag_0_3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/ISA-Group/bert-base-spanish-wwm-uncased_r-tag-0.3 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_turkish_finetuned_nli_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_turkish_finetuned_nli_en.md new file mode 100644 index 00000000000000..ef6ad5b7e63896 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_turkish_finetuned_nli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_turkish_finetuned_nli BertForSequenceClassification from aniltepe +author: John Snow Labs +name: bert_base_turkish_finetuned_nli +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_finetuned_nli` is an English model originally trained by aniltepe. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_finetuned_nli_en_5.5.0_3.0_1727309893756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_finetuned_nli_en_5.5.0_3.0_1727309893756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_finetuned_nli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_finetuned_nli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_finetuned_nli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/aniltepe/bert-base-turkish-finetuned-nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_turkish_finetuned_nli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_turkish_finetuned_nli_pipeline_en.md new file mode 100644 index 00000000000000..cc6255b1c34cab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_turkish_finetuned_nli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_turkish_finetuned_nli_pipeline pipeline BertForSequenceClassification from aniltepe +author: John Snow Labs +name: bert_base_turkish_finetuned_nli_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_finetuned_nli_pipeline` is an English model originally trained by aniltepe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_finetuned_nli_pipeline_en_5.5.0_3.0_1727309929598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_finetuned_nli_pipeline_en_5.5.0_3.0_1727309929598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_turkish_finetuned_nli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_turkish_finetuned_nli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_finetuned_nli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/aniltepe/bert-base-turkish-finetuned-nli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_10k_vulgarity_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_10k_vulgarity_en.md new file mode 100644 index 00000000000000..4e4aadeeb1945d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_10k_vulgarity_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_10k_vulgarity BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_uncased_10k_vulgarity +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_10k_vulgarity` is an English model originally trained by AbhishekkV19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_10k_vulgarity_en_5.5.0_3.0_1727364625763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_10k_vulgarity_en_5.5.0_3.0_1727364625763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_10k_vulgarity","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_10k_vulgarity", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_10k_vulgarity| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-uncased-10k-vulgarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_10k_vulgarity_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_10k_vulgarity_pipeline_en.md new file mode 100644 index 00000000000000..e4234e9a6c051f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_10k_vulgarity_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_10k_vulgarity_pipeline pipeline BertForSequenceClassification from AbhishekkV19 +author: John Snow Labs +name: bert_base_uncased_10k_vulgarity_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_10k_vulgarity_pipeline` is an English model originally trained by AbhishekkV19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_10k_vulgarity_pipeline_en_5.5.0_3.0_1727364646961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_10k_vulgarity_pipeline_en_5.5.0_3.0_1727364646961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_10k_vulgarity_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_10k_vulgarity_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_10k_vulgarity_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AbhishekkV19/bert-base-uncased-10k-vulgarity + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ag_news_finetuned_2_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ag_news_finetuned_2_en.md new file mode 100644 index 00000000000000..a08d1c3ebe57a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ag_news_finetuned_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_ag_news_finetuned_2 BertForSequenceClassification from odunola +author: John Snow Labs +name: bert_base_uncased_ag_news_finetuned_2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ag_news_finetuned_2` is an English model originally trained by odunola. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ag_news_finetuned_2_en_5.5.0_3.0_1727366166248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ag_news_finetuned_2_en_5.5.0_3.0_1727366166248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ag_news_finetuned_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ag_news_finetuned_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ag_news_finetuned_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/odunola/bert-base-uncased-ag-news-finetuned-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_agnews_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_agnews_en.md new file mode 100644 index 00000000000000..12f70921bfcca5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_agnews_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_agnews BertForSequenceClassification from tamhuynh27 +author: John Snow Labs +name: bert_base_uncased_agnews +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_agnews` is an English model originally trained by tamhuynh27. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_agnews_en_5.5.0_3.0_1727318780806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_agnews_en_5.5.0_3.0_1727318780806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_agnews","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_agnews", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_agnews| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/tamhuynh27/bert-base-uncased-agnews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_autext_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_autext_en.md new file mode 100644 index 00000000000000..67af135c8382c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_autext_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_autext BertForSequenceClassification from jorgefg03 +author: John Snow Labs +name: bert_base_uncased_autext +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_autext` is an English model originally trained by jorgefg03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_autext_en_5.5.0_3.0_1727314846763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_autext_en_5.5.0_3.0_1727314846763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_autext","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_autext", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_autext| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jorgefg03/bert-base-uncased-autext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_autext_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_autext_pipeline_en.md new file mode 100644 index 00000000000000..a79695347db9c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_autext_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_autext_pipeline pipeline BertForSequenceClassification from jorgefg03 +author: John Snow Labs +name: bert_base_uncased_autext_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_autext_pipeline` is an English model originally trained by jorgefg03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_autext_pipeline_en_5.5.0_3.0_1727314868837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_autext_pipeline_en_5.5.0_3.0_1727314868837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_autext_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_autext_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_autext_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jorgefg03/bert-base-uncased-autext + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_boolq_howey_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_boolq_howey_en.md new file mode 100644 index 00000000000000..1bf238d9df2f6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_boolq_howey_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_boolq_howey BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_boolq_howey +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_boolq_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_boolq_howey_en_5.5.0_3.0_1727369814839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_boolq_howey_en_5.5.0_3.0_1727369814839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_boolq_howey","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_boolq_howey", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_boolq_howey| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-boolq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_boolq_howey_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_boolq_howey_pipeline_en.md new file mode 100644 index 00000000000000..cb3189e45d895f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_boolq_howey_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_boolq_howey_pipeline pipeline BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_boolq_howey_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_boolq_howey_pipeline` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_boolq_howey_pipeline_en_5.5.0_3.0_1727369835935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_boolq_howey_pipeline_en_5.5.0_3.0_1727369835935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_boolq_howey_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_boolq_howey_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_boolq_howey_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-boolq + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline_en.md new file mode 100644 index 00000000000000..5c5e1df77bcd5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline pipeline BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline` is an English model originally trained by prateeky2806. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline_en_5.5.0_3.0_1727335035488.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline_en_5.5.0_3.0_1727335035488.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_cola_epochs_10_lr_5e_05_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-cola-epochs-10-lr-5e-05 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_finetuned_cola_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_finetuned_cola_en.md new file mode 100644 index 00000000000000..c24538c9890e8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_finetuned_cola_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_cola_finetuned_cola BertForSequenceClassification from kapilchauhan +author: John Snow Labs +name: bert_base_uncased_cola_finetuned_cola +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_cola_finetuned_cola` is an English model originally trained by kapilchauhan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_finetuned_cola_en_5.5.0_3.0_1727355679206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_finetuned_cola_en_5.5.0_3.0_1727355679206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola_finetuned_cola","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola_finetuned_cola", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_cola_finetuned_cola| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kapilchauhan/bert-base-uncased-CoLA-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_finetuned_cola_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_finetuned_cola_pipeline_en.md new file mode 100644 index 00000000000000..b8cb82742cce02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_cola_finetuned_cola_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_cola_finetuned_cola_pipeline pipeline BertForSequenceClassification from kapilchauhan +author: John Snow Labs +name: bert_base_uncased_cola_finetuned_cola_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_cola_finetuned_cola_pipeline` is an English model originally trained by kapilchauhan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_finetuned_cola_pipeline_en_5.5.0_3.0_1727355703757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_finetuned_cola_pipeline_en_5.5.0_3.0_1727355703757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_cola_finetuned_cola_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_cola_finetuned_cola_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_cola_finetuned_cola_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kapilchauhan/bert-base-uncased-CoLA-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_content_zeroshot_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_content_zeroshot_en.md new file mode 100644 index 00000000000000..b554aefd923d3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_content_zeroshot_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_content_zeroshot BertForSequenceClassification from kaanakdeniz +author: John Snow Labs +name: bert_base_uncased_content_zeroshot +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_content_zeroshot` is an English model originally trained by kaanakdeniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_content_zeroshot_en_5.5.0_3.0_1727348677074.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_content_zeroshot_en_5.5.0_3.0_1727348677074.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_content_zeroshot","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_content_zeroshot", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_content_zeroshot| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kaanakdeniz/bert_base_uncased_content_zeroshot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_content_zeroshot_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_content_zeroshot_pipeline_en.md new file mode 100644 index 00000000000000..ebff6d6d268874 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_content_zeroshot_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_content_zeroshot_pipeline pipeline BertForSequenceClassification from kaanakdeniz +author: John Snow Labs +name: bert_base_uncased_content_zeroshot_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_content_zeroshot_pipeline` is an English model originally trained by kaanakdeniz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_content_zeroshot_pipeline_en_5.5.0_3.0_1727348698654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_content_zeroshot_pipeline_en_5.5.0_3.0_1727348698654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_content_zeroshot_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_content_zeroshot_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_content_zeroshot_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kaanakdeniz/bert_base_uncased_content_zeroshot + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_detect_depression_stage_one_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_detect_depression_stage_one_en.md new file mode 100644 index 00000000000000..041a498f7e6633 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_detect_depression_stage_one_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_detect_depression_stage_one BertForSequenceClassification from hoanghoavienvo +author: John Snow Labs +name: bert_base_uncased_detect_depression_stage_one +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_detect_depression_stage_one` is an English model originally trained by hoanghoavienvo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_detect_depression_stage_one_en_5.5.0_3.0_1727321376493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_detect_depression_stage_one_en_5.5.0_3.0_1727321376493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_detect_depression_stage_one","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_detect_depression_stage_one", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_detect_depression_stage_one| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/hoanghoavienvo/bert-base-uncased-detect-depression-stage-one \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_dstc10_knowledge_cluster_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_dstc10_knowledge_cluster_classifier_en.md new file mode 100644 index 00000000000000..11d1674c3739f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_dstc10_knowledge_cluster_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_dstc10_knowledge_cluster_classifier BertForSequenceClassification from wilsontam +author: John Snow Labs +name: bert_base_uncased_dstc10_knowledge_cluster_classifier +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_dstc10_knowledge_cluster_classifier` is an English model originally trained by wilsontam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_dstc10_knowledge_cluster_classifier_en_5.5.0_3.0_1727325942451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_dstc10_knowledge_cluster_classifier_en_5.5.0_3.0_1727325942451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_dstc10_knowledge_cluster_classifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_dstc10_knowledge_cluster_classifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_dstc10_knowledge_cluster_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.5 MB| + +## References + +https://huggingface.co/wilsontam/bert-base-uncased-dstc10-knowledge-cluster-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_dummy_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_dummy_pipeline_en.md new file mode 100644 index 00000000000000..e0c2f195b5234f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_dummy_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_dummy_pipeline pipeline BertForSequenceClassification from stefanbschneider +author: John Snow Labs +name: bert_base_uncased_dummy_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_dummy_pipeline` is an English model originally trained by stefanbschneider. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_dummy_pipeline_en_5.5.0_3.0_1727346950455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_dummy_pipeline_en_5.5.0_3.0_1727346950455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_dummy_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_dummy_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_dummy_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/stefanbschneider/bert-base-uncased-dummy + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_emotion_mooncrescent_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_emotion_mooncrescent_en.md new file mode 100644 index 00000000000000..7204a4668f29b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_emotion_mooncrescent_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_emotion_mooncrescent BertForSequenceClassification from MoonCrescent +author: John Snow Labs +name: bert_base_uncased_emotion_mooncrescent +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_emotion_mooncrescent` is an English model originally trained by MoonCrescent. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_mooncrescent_en_5.5.0_3.0_1727338421404.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_mooncrescent_en_5.5.0_3.0_1727338421404.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_mooncrescent","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_mooncrescent", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_emotion_mooncrescent| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MoonCrescent/bert-base-uncased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_eurlex_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_eurlex_en.md new file mode 100644 index 00000000000000..1fcfd44e7b5914 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_eurlex_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_eurlex BertEmbeddings from nlpaueb +author: John Snow Labs +name: bert_base_uncased_eurlex +date: 2024-09-26 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_eurlex` is an English model originally trained by nlpaueb. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_eurlex_en_5.5.0_3.0_1727338395802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_eurlex_en_5.5.0_3.0_1727338395802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_eurlex","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_eurlex", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_eurlex| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + + + + + +https://huggingface.co/nlpaueb/bert-base-uncased-eurlex \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_eurlex_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_eurlex_pipeline_en.md new file mode 100644 index 00000000000000..29fc24eb53b416 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_eurlex_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_eurlex_pipeline pipeline BertForSequenceClassification from Kamer +author: John Snow Labs +name: bert_base_uncased_eurlex_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_eurlex_pipeline` is an English model originally trained by Kamer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_eurlex_pipeline_en_5.5.0_3.0_1727338423157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_eurlex_pipeline_en_5.5.0_3.0_1727338423157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_eurlex_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_eurlex_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_eurlex_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kamer/bert-base-uncased-eurlex + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fake_news_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fake_news_classification_pipeline_en.md new file mode 100644 index 00000000000000..7de81ff2d8cfbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fake_news_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_fake_news_classification_pipeline pipeline BertForSequenceClassification from MScDS2023 +author: John Snow Labs +name: bert_base_uncased_fake_news_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_fake_news_classification_pipeline` is an English model originally trained by MScDS2023. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fake_news_classification_pipeline_en_5.5.0_3.0_1727347884734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fake_news_classification_pipeline_en_5.5.0_3.0_1727347884734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_fake_news_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_fake_news_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_fake_news_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MScDS2023/bert-base-uncased-fake-news-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fever_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fever_en.md new file mode 100644 index 00000000000000..83d4478cb09c1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fever_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_fever BertForSequenceClassification from sagnikrayc +author: John Snow Labs +name: bert_base_uncased_fever +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_fever` is an English model originally trained by sagnikrayc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fever_en_5.5.0_3.0_1727349432547.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fever_en_5.5.0_3.0_1727349432547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fever","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fever", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_fever| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sagnikrayc/bert-base-uncased-fever \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_en.md new file mode 100644 index 00000000000000..bc0c7f5dd0c2e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_fine_tuned BertForSequenceClassification from Lecs +author: John Snow Labs +name: bert_base_uncased_fine_tuned +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_fine_tuned` is an English model originally trained by Lecs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fine_tuned_en_5.5.0_3.0_1727311658780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fine_tuned_en_5.5.0_3.0_1727311658780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fine_tuned","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fine_tuned", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_fine_tuned| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Lecs/bert-base-uncased-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5_en.md new file mode 100644 index 00000000000000..7a127a25a2241a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5 BertForSequenceClassification from adutlersaar +author: John Snow Labs +name: bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5` is an English model originally trained by adutlersaar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5_en_5.5.0_3.0_1727349802636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5_en_5.5.0_3.0_1727349802636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_fine_tuned_parler_data_norwegian_bart_norwegian_t5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/adutlersaar/bert-base-uncased-fine-tuned-parler_data-no_bart-no_t5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained_en.md new file mode 100644 index 00000000000000..813d782379b586 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained BertForSequenceClassification from adutlersaar +author: John Snow Labs +name: bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained` is an English model originally trained by adutlersaar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained_en_5.5.0_3.0_1727320308338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained_en_5.5.0_3.0_1727320308338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_fine_tuned_parler_data_with_bart_with_t5_norwegian_bpe_adv_trained| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/adutlersaar/bert-base-uncased-fine-tuned-parler_data-with_bart-with_t5-no_bpe-adv-trained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline_en.md new file mode 100644 index 00000000000000..1b3f7c172fde70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline pipeline BertForSequenceClassification from liangyuant +author: John Snow Labs +name: bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline` is an English model originally trained by liangyuant.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline_en_5.5.0_3.0_1727350804885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline_en_5.5.0_3.0_1727350804885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_10epoch_num200_450_405cls_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.6 MB| + +## References + +https://huggingface.co/liangyuant/bert-base-uncased-finetuned-10epoch-num200-450-405cls + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_argument_detection_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_argument_detection_en.md new file mode 100644 index 00000000000000..8b0cc863f795a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_argument_detection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_argument_detection BertForSequenceClassification from vzty +author: John Snow Labs +name: bert_base_uncased_finetuned_argument_detection +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_argument_detection` is an English model originally trained by vzty.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_argument_detection_en_5.5.0_3.0_1727350095507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_argument_detection_en_5.5.0_3.0_1727350095507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_argument_detection","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_argument_detection", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_argument_detection| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/vzty/bert-base-uncased-finetuned-argument-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_clef_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_clef_pipeline_en.md new file mode 100644 index 00000000000000..b43a739931bae5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_clef_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_clef_pipeline pipeline BertForSequenceClassification from lmajer +author: John Snow Labs +name: bert_base_uncased_finetuned_clef_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_clef_pipeline` is an English model originally trained by lmajer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clef_pipeline_en_5.5.0_3.0_1727352799390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clef_pipeline_en_5.5.0_3.0_1727352799390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_clef_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_clef_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_clef_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/lmajer/bert-base-uncased-finetuned-CLEF + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_clinc_oos_nickapch_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_clinc_oos_nickapch_en.md new file mode 100644 index 00000000000000..6c5d17b0cdde2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_clinc_oos_nickapch_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_clinc_oos_nickapch BertForSequenceClassification from nickapch +author: John Snow Labs +name: bert_base_uncased_finetuned_clinc_oos_nickapch +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_clinc_oos_nickapch` is an English model originally trained by nickapch.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clinc_oos_nickapch_en_5.5.0_3.0_1727332458268.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_clinc_oos_nickapch_en_5.5.0_3.0_1727332458268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_clinc_oos_nickapch","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_clinc_oos_nickapch", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_clinc_oos_nickapch| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/nickapch/bert-base-uncased-finetuned-clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_alemdarberk_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_alemdarberk_en.md new file mode 100644 index 00000000000000..69ee37f3d19d24 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_alemdarberk_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_alemdarberk BertForSequenceClassification from alemdarberk +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_alemdarberk +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_alemdarberk` is an English model originally trained by alemdarberk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_alemdarberk_en_5.5.0_3.0_1727320955188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_alemdarberk_en_5.5.0_3.0_1727320955188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_alemdarberk","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_alemdarberk", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_alemdarberk| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/alemdarberk/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ayouta300_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ayouta300_en.md new file mode 100644 index 00000000000000..e8d6ce6d059a5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ayouta300_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_ayouta300 BertForSequenceClassification from Ayouta300 +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_ayouta300 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_ayouta300` is an English model originally trained by Ayouta300.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ayouta300_en_5.5.0_3.0_1727320765345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ayouta300_en_5.5.0_3.0_1727320765345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_ayouta300","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_ayouta300", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_ayouta300| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ayouta300/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ayouta300_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ayouta300_pipeline_en.md new file mode 100644 index 00000000000000..6a4f53aaec29c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ayouta300_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_ayouta300_pipeline pipeline BertForSequenceClassification from Ayouta300 +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_ayouta300_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_ayouta300_pipeline` is an English model originally trained by Ayouta300.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ayouta300_pipeline_en_5.5.0_3.0_1727320786931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ayouta300_pipeline_en_5.5.0_3.0_1727320786931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_ayouta300_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_ayouta300_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_ayouta300_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ayouta300/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_aysin_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_aysin_en.md new file mode 100644 index 00000000000000..b1b1977230b69f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_aysin_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_aysin BertForSequenceClassification from aysin +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_aysin +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_aysin` is an English model originally trained by aysin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_aysin_en_5.5.0_3.0_1727317230986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_aysin_en_5.5.0_3.0_1727317230986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_aysin","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_aysin", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_aysin| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/aysin/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_aysin_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_aysin_pipeline_en.md new file mode 100644 index 00000000000000..38b5278cc4f45f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_aysin_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_aysin_pipeline pipeline BertForSequenceClassification from aysin +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_aysin_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_aysin_pipeline` is an English model originally trained by aysin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_aysin_pipeline_en_5.5.0_3.0_1727317252886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_aysin_pipeline_en_5.5.0_3.0_1727317252886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_aysin_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_aysin_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_aysin_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/aysin/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_batch_32_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_batch_32_en.md new file mode 100644 index 00000000000000..48a8eaf5847e8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_batch_32_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_batch_32 BertForSequenceClassification from cansurav +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_batch_32 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_batch_32` is an English model originally trained by cansurav. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_batch_32_en_5.5.0_3.0_1727316079629.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_batch_32_en_5.5.0_3.0_1727316079629.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_batch_32","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_batch_32", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_batch_32| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-batch-32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_batch_64_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_batch_64_en.md new file mode 100644 index 00000000000000..0653af92e619c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_batch_64_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_batch_64 BertForSequenceClassification from cansurav +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_batch_64 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_batch_64` is an English model originally trained by cansurav. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_batch_64_en_5.5.0_3.0_1727321670986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_batch_64_en_5.5.0_3.0_1727321670986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_batch_64","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_batch_64", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_batch_64| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-batch-64 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bilalkabas_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bilalkabas_en.md new file mode 100644 index 00000000000000..bb09beec85b477 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bilalkabas_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_bilalkabas BertForSequenceClassification from bilalkabas +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_bilalkabas +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_bilalkabas` is an English model originally trained by bilalkabas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_bilalkabas_en_5.5.0_3.0_1727322175156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_bilalkabas_en_5.5.0_3.0_1727322175156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_bilalkabas","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_bilalkabas", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_bilalkabas| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/bilalkabas/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bonurtek_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bonurtek_en.md new file mode 100644 index 00000000000000..5953f1e333a511 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bonurtek_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_bonurtek BertForSequenceClassification from bonurtek +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_bonurtek +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_bonurtek` is an English model originally trained by bonurtek. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_bonurtek_en_5.5.0_3.0_1727351045581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_bonurtek_en_5.5.0_3.0_1727351045581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_bonurtek","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_bonurtek", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_bonurtek| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/bonurtek/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bonurtek_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bonurtek_pipeline_en.md new file mode 100644 index 00000000000000..0c9ca51a77b012 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_bonurtek_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_bonurtek_pipeline pipeline BertForSequenceClassification from bonurtek +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_bonurtek_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_bonurtek_pipeline` is an English model originally trained by bonurtek. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_bonurtek_pipeline_en_5.5.0_3.0_1727351067047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_bonurtek_pipeline_en_5.5.0_3.0_1727351067047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_bonurtek_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_bonurtek_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_bonurtek_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/bonurtek/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_dropout_0_6_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_dropout_0_6_en.md new file mode 100644 index 00000000000000..0eb6a60e6bcb81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_dropout_0_6_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_dropout_0_6 BertForSequenceClassification from cansurav +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_dropout_0_6 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_dropout_0_6` is an English model originally trained by cansurav. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_dropout_0_6_en_5.5.0_3.0_1727319204540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_dropout_0_6_en_5.5.0_3.0_1727319204540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_dropout_0_6","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_dropout_0_6", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_dropout_0_6| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-dropout-0.6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_dropout_0_6_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_dropout_0_6_pipeline_en.md new file mode 100644 index 00000000000000..0228374823db1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_dropout_0_6_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_dropout_0_6_pipeline pipeline BertForSequenceClassification from cansurav +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_dropout_0_6_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_dropout_0_6_pipeline` is an English model originally trained by cansurav. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_dropout_0_6_pipeline_en_5.5.0_3.0_1727319228273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_dropout_0_6_pipeline_en_5.5.0_3.0_1727319228273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_dropout_0_6_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_dropout_0_6_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_dropout_0_6_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-dropout-0.6 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_elifcen_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_elifcen_en.md new file mode 100644 index 00000000000000..a655cfaeadec37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_elifcen_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_elifcen BertForSequenceClassification from elifcen +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_elifcen +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_elifcen` is an English model originally trained by elifcen. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_elifcen_en_5.5.0_3.0_1727352032618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_elifcen_en_5.5.0_3.0_1727352032618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_elifcen","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_elifcen", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_elifcen| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/elifcen/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_esragenc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_esragenc_pipeline_en.md new file mode 100644 index 00000000000000..cbb50aa0e02e89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_esragenc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_esragenc_pipeline pipeline BertForSequenceClassification from esragenc +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_esragenc_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_esragenc_pipeline` is an English model originally trained by esragenc. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_esragenc_pipeline_en_5.5.0_3.0_1727354958620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_esragenc_pipeline_en_5.5.0_3.0_1727354958620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_esragenc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_esragenc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_esragenc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/esragenc/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ilkekas_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ilkekas_en.md new file mode 100644 index 00000000000000..61764116e83089 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ilkekas_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_ilkekas BertForSequenceClassification from ilkekas +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_ilkekas +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_ilkekas` is an English model originally trained by ilkekas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ilkekas_en_5.5.0_3.0_1727356258115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ilkekas_en_5.5.0_3.0_1727356258115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_ilkekas","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_ilkekas", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_ilkekas| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ilkekas/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ilkekas_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ilkekas_pipeline_en.md new file mode 100644 index 00000000000000..639b5a119ac4e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_ilkekas_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_ilkekas_pipeline pipeline BertForSequenceClassification from ilkekas +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_ilkekas_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_ilkekas_pipeline` is an English model originally trained by ilkekas. 
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ilkekas_pipeline_en_5.5.0_3.0_1727356279671.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_ilkekas_pipeline_en_5.5.0_3.0_1727356279671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_ilkekas_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_ilkekas_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_ilkekas_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/ilkekas/bert-base-uncased-finetuned-cola
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_jxl99_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_jxl99_pipeline_en.md
new file mode 100644
index 00000000000000..b9f43ff0dad0e1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_jxl99_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_jxl99_pipeline pipeline BertForSequenceClassification from JXL99
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_jxl99_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_jxl99_pipeline` is an English model originally trained by JXL99.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_jxl99_pipeline_en_5.5.0_3.0_1727333079323.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_jxl99_pipeline_en_5.5.0_3.0_1727333079323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_jxl99_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_jxl99_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_jxl99_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/JXL99/bert-base-uncased-finetuned-cola
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_kreola_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_kreola_en.md
new file mode 100644
index 00000000000000..960a7cb4f456ab
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_kreola_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_kreola BertForSequenceClassification from kreola
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_kreola
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_kreola` is an English model originally trained by kreola.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_kreola_en_5.5.0_3.0_1727318400114.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_kreola_en_5.5.0_3.0_1727318400114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_kreola","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_kreola", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_kreola|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/kreola/bert-base-uncased-finetuned-cola
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_kreola_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_kreola_pipeline_en.md
new file mode 100644
index 00000000000000..842ddbdfe468a5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_kreola_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_kreola_pipeline pipeline BertForSequenceClassification from kreola
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_kreola_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_kreola_pipeline` is an English model originally trained by kreola.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_kreola_pipeline_en_5.5.0_3.0_1727318422429.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_kreola_pipeline_en_5.5.0_3.0_1727318422429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_kreola_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_kreola_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_kreola_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/kreola/bert-base-uncased-finetuned-cola
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_0_0001_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_0_0001_en.md
new file mode 100644
index 00000000000000..003ee94f10705c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_0_0001_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_learning_rate_0_0001 BertForSequenceClassification from cansurav
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_learning_rate_0_0001
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_learning_rate_0_0001` is an English model originally trained by cansurav.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_0_0001_en_5.5.0_3.0_1727348916152.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_0_0001_en_5.5.0_3.0_1727348916152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_learning_rate_0_0001","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_learning_rate_0_0001", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_learning_rate_0_0001|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-learning_rate-0.0001
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline_en.md
new file mode 100644
index 00000000000000..b277830ac730e4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline pipeline BertForSequenceClassification from cansurav
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline` is an English model originally trained by cansurav.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline_en_5.5.0_3.0_1727348937541.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline_en_5.5.0_3.0_1727348937541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_learning_rate_0_0001_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-learning_rate-0.0001
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_9e_06_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_9e_06_en.md
new file mode 100644
index 00000000000000..f58d997b687f4f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_learning_rate_9e_06_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_learning_rate_9e_06 BertForSequenceClassification from cansurav
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_learning_rate_9e_06
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_learning_rate_9e_06` is an English model originally trained by cansurav.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_9e_06_en_5.5.0_3.0_1727319334769.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_learning_rate_9e_06_en_5.5.0_3.0_1727319334769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_learning_rate_9e_06","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_learning_rate_9e_06", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_learning_rate_9e_06|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/cansurav/bert-base-uncased-finetuned-cola-learning_rate-9e-06
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_lowkemy_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_lowkemy_en.md
new file mode 100644
index 00000000000000..981ea4d374d80c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_lowkemy_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_lowkemy BertForSequenceClassification from lowkemy
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_lowkemy
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_lowkemy` is an English model originally trained by lowkemy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_lowkemy_en_5.5.0_3.0_1727352857333.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_lowkemy_en_5.5.0_3.0_1727352857333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_lowkemy","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_lowkemy", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_lowkemy|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/lowkemy/bert-base-uncased-finetuned-cola
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_melihberky_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_melihberky_pipeline_en.md
new file mode 100644
index 00000000000000..39f85e5e1361fe
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_melihberky_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_melihberky_pipeline pipeline BertForSequenceClassification from melihberky
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_melihberky_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_melihberky_pipeline` is an English model originally trained by melihberky.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_melihberky_pipeline_en_5.5.0_3.0_1727348416601.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_melihberky_pipeline_en_5.5.0_3.0_1727348416601.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_melihberky_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_melihberky_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_melihberky_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/melihberky/bert-base-uncased-finetuned-cola
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_from_server_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_from_server_en.md
new file mode 100644
index 00000000000000..4460b14d37161e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_from_server_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_sepehr_from_server BertForSequenceClassification from sepehrbakhshi
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_sepehr_from_server
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_sepehr_from_server` is an English model originally trained by sepehrbakhshi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_from_server_en_5.5.0_3.0_1727318738692.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_from_server_en_5.5.0_3.0_1727318738692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_sepehr_from_server","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_sepehr_from_server", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_sepehr_from_server|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/sepehrbakhshi/bert-base-uncased-finetuned-cola_sepehr_from_server
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline_en.md
new file mode 100644
index 00000000000000..fe1a53b66bebd3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline pipeline BertForSequenceClassification from sepehrbakhshi
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline` is an English model originally trained by sepehrbakhshi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline_en_5.5.0_3.0_1727318760915.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline_en_5.5.0_3.0_1727318760915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_uncased_finetuned_cola_sepehr_from_server_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/sepehrbakhshi/bert-base-uncased-finetuned-cola_sepehr_from_server
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa_en.md
new file mode 100644
index 00000000000000..c7c8d9e7deda9c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa BertForSequenceClassification from sepehrbakhshi
+author: John Snow Labs
+name: bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa` is an English model originally trained by sepehrbakhshi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa_en_5.5.0_3.0_1727352575823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa_en_5.5.0_3.0_1727352575823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_sepehr_sepehr_sepehr_saturday_nepal_bhasa| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sepehrbakhshi/bert-base-uncased-finetuned-cola_sepehr_sepehr_sepehr_saturday_new \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_zeynoko_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_zeynoko_en.md new file mode 100644 index 00000000000000..63e461e2636ccd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_zeynoko_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_zeynoko BertForSequenceClassification from Zeynoko +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_zeynoko +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_zeynoko` is an English model originally trained by Zeynoko. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_zeynoko_en_5.5.0_3.0_1727318242846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_zeynoko_en_5.5.0_3.0_1727318242846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_zeynoko","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_cola_zeynoko", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_zeynoko| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Zeynoko/bert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_zeynoko_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_zeynoko_pipeline_en.md new file mode 100644 index 00000000000000..3aac19ab615b6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_cola_zeynoko_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_cola_zeynoko_pipeline pipeline BertForSequenceClassification from Zeynoko +author: John Snow Labs +name: bert_base_uncased_finetuned_cola_zeynoko_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_cola_zeynoko_pipeline` is an English model originally trained by Zeynoko. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_zeynoko_pipeline_en_5.5.0_3.0_1727318268409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_cola_zeynoko_pipeline_en_5.5.0_3.0_1727318268409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_cola_zeynoko_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_cola_zeynoko_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_cola_zeynoko_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Zeynoko/bert-base-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_detests_29_10_2022_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_detests_29_10_2022_en.md new file mode 100644 index 00000000000000..c23839b27ec63f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_detests_29_10_2022_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_detests_29_10_2022 BertForSequenceClassification from Pablo94 +author: John Snow Labs +name: bert_base_uncased_finetuned_detests_29_10_2022 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_detests_29_10_2022` is an English model originally trained by Pablo94. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_detests_29_10_2022_en_5.5.0_3.0_1727317162328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_detests_29_10_2022_en_5.5.0_3.0_1727317162328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_detests_29_10_2022","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_detests_29_10_2022", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_detests_29_10_2022| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Pablo94/bert-base-uncased-finetuned-detests-29-10-2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_dropout_cola_0_8_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_dropout_cola_0_8_en.md new file mode 100644 index 00000000000000..087685140f77b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_dropout_cola_0_8_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_dropout_cola_0_8 BertForSequenceClassification from yagmurery +author: John Snow Labs +name: bert_base_uncased_finetuned_dropout_cola_0_8 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_dropout_cola_0_8` is an English model originally trained by yagmurery. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_dropout_cola_0_8_en_5.5.0_3.0_1727340994307.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_dropout_cola_0_8_en_5.5.0_3.0_1727340994307.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_dropout_cola_0_8","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_dropout_cola_0_8", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_dropout_cola_0_8| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yagmurery/bert-base-uncased-finetuned-dropout-cola-0.8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_dropout_cola_0_8_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_dropout_cola_0_8_pipeline_en.md new file mode 100644 index 00000000000000..1005f903fc0c3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_dropout_cola_0_8_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_dropout_cola_0_8_pipeline pipeline BertForSequenceClassification from yagmurery +author: John Snow Labs +name: bert_base_uncased_finetuned_dropout_cola_0_8_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_dropout_cola_0_8_pipeline` is an English model originally trained by yagmurery. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_dropout_cola_0_8_pipeline_en_5.5.0_3.0_1727341015535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_dropout_cola_0_8_pipeline_en_5.5.0_3.0_1727341015535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_dropout_cola_0_8_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_dropout_cola_0_8_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_dropout_cola_0_8_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yagmurery/bert-base-uncased-finetuned-dropout-cola-0.8 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_filtered_0608_test_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_filtered_0608_test_en.md new file mode 100644 index 00000000000000..2acbe2983a8976 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_filtered_0608_test_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_filtered_0608_test BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: bert_base_uncased_finetuned_filtered_0608_test +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_filtered_0608_test` is an English model originally trained by YeRyeongLee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_filtered_0608_test_en_5.5.0_3.0_1727346314253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_filtered_0608_test_en_5.5.0_3.0_1727346314253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_filtered_0608_test","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_filtered_0608_test", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_filtered_0608_test| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/YeRyeongLee/bert-base-uncased-finetuned-filtered-0608_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_filtered_0609_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_filtered_0609_pipeline_en.md new file mode 100644 index 00000000000000..4a2bf423cefbf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_filtered_0609_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_filtered_0609_pipeline pipeline BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: bert_base_uncased_finetuned_filtered_0609_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_filtered_0609_pipeline` is an English model originally trained by YeRyeongLee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_filtered_0609_pipeline_en_5.5.0_3.0_1727346262555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_filtered_0609_pipeline_en_5.5.0_3.0_1727346262555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_filtered_0609_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_filtered_0609_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_filtered_0609_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/YeRyeongLee/bert-base-uncased-finetuned-filtered-0609 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_humordetection_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_humordetection_en.md new file mode 100644 index 00000000000000..aec938c871bc8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_humordetection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_humordetection BertForSequenceClassification from thanawan +author: John Snow Labs +name: bert_base_uncased_finetuned_humordetection +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_humordetection` is an English model originally trained by thanawan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_humordetection_en_5.5.0_3.0_1727309135954.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_humordetection_en_5.5.0_3.0_1727309135954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_humordetection","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_humordetection", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_humordetection| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/thanawan/bert-base-uncased-finetuned-humordetection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_humordetection_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_humordetection_pipeline_en.md new file mode 100644 index 00000000000000..7cf17cf9179b21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_humordetection_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_humordetection_pipeline pipeline BertForSequenceClassification from thanawan +author: John Snow Labs +name: bert_base_uncased_finetuned_humordetection_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_humordetection_pipeline` is an English model originally trained by thanawan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_humordetection_pipeline_en_5.5.0_3.0_1727309158132.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_humordetection_pipeline_en_5.5.0_3.0_1727309158132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_humordetection_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_humordetection_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_humordetection_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/thanawan/bert-base-uncased-finetuned-humordetection + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline_en.md new file mode 100644 index 00000000000000..bb1b5b1c45b8d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline pipeline BertForSequenceClassification from nikitakapitan +author: John Snow Labs +name: bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline` is an English model originally trained by nikitakapitan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline_en_5.5.0_3.0_1727309531880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline_en_5.5.0_3.0_1727309531880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_imdb_nikitakapitan_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nikitakapitan/bert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_512_5_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_512_5_en.md new file mode 100644 index 00000000000000..dca8281952e085 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_512_5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mnli_512_5 BertForSequenceClassification from yy642 +author: John Snow Labs +name: bert_base_uncased_finetuned_mnli_512_5 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mnli_512_5` is an English model originally trained by yy642. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_512_5_en_5.5.0_3.0_1727313261743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_512_5_en_5.5.0_3.0_1727313261743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mnli_512_5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mnli_512_5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mnli_512_5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yy642/bert-base-uncased-finetuned-mnli-512-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_512_5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_512_5_pipeline_en.md new file mode 100644 index 00000000000000..7035d96125bd4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_512_5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mnli_512_5_pipeline pipeline BertForSequenceClassification from yy642 +author: John Snow Labs +name: bert_base_uncased_finetuned_mnli_512_5_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mnli_512_5_pipeline` is an English model originally trained by yy642. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_512_5_pipeline_en_5.5.0_3.0_1727313283790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_512_5_pipeline_en_5.5.0_3.0_1727313283790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_mnli_512_5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_mnli_512_5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mnli_512_5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yy642/bert-base-uncased-finetuned-mnli-512-5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_minseok0809_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_minseok0809_en.md new file mode 100644 index 00000000000000..8d929803119cbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_minseok0809_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mnli_minseok0809 BertForSequenceClassification from minseok0809 +author: John Snow Labs +name: bert_base_uncased_finetuned_mnli_minseok0809 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mnli_minseok0809` is an English model originally trained by minseok0809. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_minseok0809_en_5.5.0_3.0_1727339050105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_minseok0809_en_5.5.0_3.0_1727339050105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mnli_minseok0809","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mnli_minseok0809", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mnli_minseok0809| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/minseok0809/bert-base-uncased-finetuned-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_minseok0809_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_minseok0809_pipeline_en.md new file mode 100644 index 00000000000000..71a44163853210 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_minseok0809_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mnli_minseok0809_pipeline pipeline BertForSequenceClassification from minseok0809 +author: John Snow Labs +name: bert_base_uncased_finetuned_mnli_minseok0809_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mnli_minseok0809_pipeline` is an English model originally trained by minseok0809. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_minseok0809_pipeline_en_5.5.0_3.0_1727339071854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_minseok0809_pipeline_en_5.5.0_3.0_1727339071854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_mnli_minseok0809_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_mnli_minseok0809_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mnli_minseok0809_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/minseok0809/bert-base-uncased-finetuned-mnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline_en.md new file mode 100644 index 00000000000000..ad2e5139692f99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline pipeline BertForSequenceClassification from yy642 +author: John Snow Labs +name: bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline` is an English model originally trained by yy642. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline_en_5.5.0_3.0_1727346362500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline_en_5.5.0_3.0_1727346362500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mnli_rte_wnli_5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yy642/bert-base-uncased-finetuned-mnli-rte-wnli-5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_minseok0809_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_minseok0809_pipeline_en.md new file mode 100644 index 00000000000000..25fd7111fd1078 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_minseok0809_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mrpc_minseok0809_pipeline pipeline BertForSequenceClassification from minseok0809 +author: John Snow Labs +name: bert_base_uncased_finetuned_mrpc_minseok0809_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mrpc_minseok0809_pipeline` is an English model originally trained by minseok0809. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_minseok0809_pipeline_en_5.5.0_3.0_1727345283252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_minseok0809_pipeline_en_5.5.0_3.0_1727345283252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_mrpc_minseok0809_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_mrpc_minseok0809_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mrpc_minseok0809_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/minseok0809/bert-base-uncased-finetuned-mrpc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_senfu_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_senfu_en.md new file mode 100644 index 00000000000000..a196093c4cf657 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_senfu_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mrpc_senfu BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_finetuned_mrpc_senfu +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mrpc_senfu` is an English model originally trained by senfu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_senfu_en_5.5.0_3.0_1727348540471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_senfu_en_5.5.0_3.0_1727348540471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mrpc_senfu","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_mrpc_senfu", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mrpc_senfu| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-finetuned-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_senfu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_senfu_pipeline_en.md new file mode 100644 index 00000000000000..726a45c146d2e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_mrpc_senfu_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_mrpc_senfu_pipeline pipeline BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_finetuned_mrpc_senfu_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_mrpc_senfu_pipeline` is an English model originally trained by senfu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_senfu_pipeline_en_5.5.0_3.0_1727348562459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_mrpc_senfu_pipeline_en_5.5.0_3.0_1727348562459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_mrpc_senfu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_mrpc_senfu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_mrpc_senfu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-finetuned-mrpc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qnli_senfu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qnli_senfu_pipeline_en.md new file mode 100644 index 00000000000000..41c0d2d451c257 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qnli_senfu_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_qnli_senfu_pipeline pipeline BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_finetuned_qnli_senfu_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_qnli_senfu_pipeline` is an English model originally trained by senfu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qnli_senfu_pipeline_en_5.5.0_3.0_1727355209061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qnli_senfu_pipeline_en_5.5.0_3.0_1727355209061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_qnli_senfu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_qnli_senfu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_qnli_senfu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-finetuned-qnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qqp_anuj55_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qqp_anuj55_en.md new file mode 100644 index 00000000000000..e6c2beae9f93f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qqp_anuj55_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_qqp_anuj55 BertForSequenceClassification from anuj55 +author: John Snow Labs +name: bert_base_uncased_finetuned_qqp_anuj55 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_qqp_anuj55` is an English model originally trained by anuj55. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qqp_anuj55_en_5.5.0_3.0_1727353822443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qqp_anuj55_en_5.5.0_3.0_1727353822443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_qqp_anuj55","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_qqp_anuj55", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_qqp_anuj55| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/anuj55/bert-base-uncased-finetuned-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qqp_w05230505_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qqp_w05230505_en.md new file mode 100644 index 00000000000000..770f01f21bb8ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_qqp_w05230505_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_qqp_w05230505 BertForSequenceClassification from w05230505 +author: John Snow Labs +name: bert_base_uncased_finetuned_qqp_w05230505 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_qqp_w05230505` is an English model originally trained by w05230505. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qqp_w05230505_en_5.5.0_3.0_1727338519098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_qqp_w05230505_en_5.5.0_3.0_1727338519098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_qqp_w05230505","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_qqp_w05230505", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_qqp_w05230505| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/w05230505/bert-base-uncased-finetuned-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_removed_0529_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_removed_0529_pipeline_en.md new file mode 100644 index 00000000000000..5d2f0806aac1e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_removed_0529_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_removed_0529_pipeline pipeline BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: bert_base_uncased_finetuned_removed_0529_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_removed_0529_pipeline` is an English model originally trained by YeRyeongLee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_removed_0529_pipeline_en_5.5.0_3.0_1727319028384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_removed_0529_pipeline_en_5.5.0_3.0_1727319028384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_removed_0529_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_removed_0529_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_removed_0529_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/YeRyeongLee/bert-base-uncased-finetuned-removed-0529 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_removed_0530_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_removed_0530_en.md new file mode 100644 index 00000000000000..e08e48f5a19d1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_removed_0530_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_removed_0530 BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: bert_base_uncased_finetuned_removed_0530 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_removed_0530` is an English model originally trained by YeRyeongLee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_removed_0530_en_5.5.0_3.0_1727347475509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_removed_0530_en_5.5.0_3.0_1727347475509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_removed_0530","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_removed_0530", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_removed_0530| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/YeRyeongLee/bert-base-uncased-finetuned-removed-0530 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_max_length_256_epoch_5_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_max_length_256_epoch_5_en.md new file mode 100644 index 00000000000000..888248bdaf5046 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_max_length_256_epoch_5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_rte_max_length_256_epoch_5 BertForSequenceClassification from yy642 +author: John Snow Labs +name: bert_base_uncased_finetuned_rte_max_length_256_epoch_5 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_rte_max_length_256_epoch_5` is an English model originally trained by yy642. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_max_length_256_epoch_5_en_5.5.0_3.0_1727310863156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_max_length_256_epoch_5_en_5.5.0_3.0_1727310863156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_rte_max_length_256_epoch_5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_rte_max_length_256_epoch_5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_rte_max_length_256_epoch_5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yy642/bert-base-uncased-finetuned-rte-max-length-256-epoch-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_senfu_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_senfu_en.md new file mode 100644 index 00000000000000..995507716bfcfd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_senfu_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_rte_senfu BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_finetuned_rte_senfu +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_rte_senfu` is an English model originally trained by senfu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_senfu_en_5.5.0_3.0_1727321376664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_senfu_en_5.5.0_3.0_1727321376664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_rte_senfu","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_rte_senfu", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_rte_senfu| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-finetuned-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_senfu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_senfu_pipeline_en.md new file mode 100644 index 00000000000000..da7b5c052986ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_rte_senfu_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_rte_senfu_pipeline pipeline BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_finetuned_rte_senfu_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_rte_senfu_pipeline` is an English model originally trained by senfu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_senfu_pipeline_en_5.5.0_3.0_1727321402066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_rte_senfu_pipeline_en_5.5.0_3.0_1727321402066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_rte_senfu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_rte_senfu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_rte_senfu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-finetuned-rte + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sdg_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sdg_pipeline_en.md new file mode 100644 index 00000000000000..f2bc02e51b9331 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sdg_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sdg_pipeline pipeline BertForSequenceClassification from jonas +author: John Snow Labs +name: bert_base_uncased_finetuned_sdg_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sdg_pipeline` is an English model originally trained by jonas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sdg_pipeline_en_5.5.0_3.0_1727310716679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sdg_pipeline_en_5.5.0_3.0_1727310716679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_sdg_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_sdg_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sdg_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/jonas/bert-base-uncased-finetuned-sdg + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_spam_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_spam_en.md new file mode 100644 index 00000000000000..55430185473dba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_spam_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_spam BertForSequenceClassification from ana-grassmann +author: John Snow Labs +name: bert_base_uncased_finetuned_spam +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_spam` is an English model originally trained by ana-grassmann. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_spam_en_5.5.0_3.0_1727368582437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_spam_en_5.5.0_3.0_1727368582437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_spam","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_spam", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_spam| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ana-grassmann/bert-base-uncased-finetuned-spam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_spam_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_spam_pipeline_en.md new file mode 100644 index 00000000000000..75ec03911f8c8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_spam_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_spam_pipeline pipeline BertForSequenceClassification from ana-grassmann +author: John Snow Labs +name: bert_base_uncased_finetuned_spam_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_spam_pipeline` is an English model originally trained by ana-grassmann. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_spam_pipeline_en_5.5.0_3.0_1727368603083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_spam_pipeline_en_5.5.0_3.0_1727368603083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_spam_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_spam_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_spam_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ana-grassmann/bert-base-uncased-finetuned-spam + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sql_classification_with_question_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sql_classification_with_question_en.md new file mode 100644 index 00000000000000..dc99226c1bbc19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sql_classification_with_question_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sql_classification_with_question BertForSequenceClassification from PatWang +author: John Snow Labs +name: bert_base_uncased_finetuned_sql_classification_with_question +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sql_classification_with_question` is an English model originally trained by PatWang. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sql_classification_with_question_en_5.5.0_3.0_1727339525153.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sql_classification_with_question_en_5.5.0_3.0_1727339525153.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sql_classification_with_question","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sql_classification_with_question", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sql_classification_with_question| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/PatWang/bert-base-uncased-finetuned-sql-classification-with_question \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_axljeremy7_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_axljeremy7_en.md new file mode 100644 index 00000000000000..41850ecd4a537f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_axljeremy7_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_axljeremy7 BertForSequenceClassification from axljeremy7 +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_axljeremy7 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_axljeremy7` is an English model originally trained by axljeremy7. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_axljeremy7_en_5.5.0_3.0_1727317612398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_axljeremy7_en_5.5.0_3.0_1727317612398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_axljeremy7","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_axljeremy7", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_axljeremy7| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/axljeremy7/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_axljeremy7_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_axljeremy7_pipeline_en.md new file mode 100644 index 00000000000000..a4fbac5aa3d79b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_axljeremy7_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_axljeremy7_pipeline pipeline BertForSequenceClassification from axljeremy7 +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_axljeremy7_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_axljeremy7_pipeline` is an English model originally trained by axljeremy7. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_axljeremy7_pipeline_en_5.5.0_3.0_1727317635988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_axljeremy7_pipeline_en_5.5.0_3.0_1727317635988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_sst2_axljeremy7_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_sst2_axljeremy7_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_axljeremy7_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/axljeremy7/bert-base-uncased-finetuned-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_pranav4205_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_pranav4205_en.md new file mode 100644 index 00000000000000..022af30c0884f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_pranav4205_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_pranav4205 BertForSequenceClassification from pranav4205 +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_pranav4205 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_pranav4205` is an English model originally trained by pranav4205. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_pranav4205_en_5.5.0_3.0_1727316524197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_pranav4205_en_5.5.0_3.0_1727316524197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_pranav4205","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_pranav4205", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_pranav4205| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/pranav4205/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_sasuke_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_sasuke_en.md new file mode 100644 index 00000000000000..32a59cedaf7f33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_sasuke_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_sasuke BertForSequenceClassification from sasuke +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_sasuke +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_sasuke` is an English model originally trained by sasuke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_sasuke_en_5.5.0_3.0_1727349059086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_sasuke_en_5.5.0_3.0_1727349059086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_sasuke","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_sasuke", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_sasuke| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sasuke/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_senfu_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_senfu_en.md new file mode 100644 index 00000000000000..f978fac1ccff9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_senfu_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_senfu BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_senfu +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_senfu` is an English model originally trained by senfu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_senfu_en_5.5.0_3.0_1727348392423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_senfu_en_5.5.0_3.0_1727348392423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_senfu","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_senfu", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_senfu| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_senfu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_senfu_pipeline_en.md new file mode 100644 index 00000000000000..2ca6d27f61b7bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_senfu_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_senfu_pipeline pipeline BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_senfu_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_senfu_pipeline` is an English model originally trained by senfu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_senfu_pipeline_en_5.5.0_3.0_1727348422264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_senfu_pipeline_en_5.5.0_3.0_1727348422264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_sst2_senfu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_sst2_senfu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_senfu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-finetuned-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_tech_oriented_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_tech_oriented_en.md new file mode 100644 index 00000000000000..2ed1deca37ac86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst2_tech_oriented_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_tech_oriented BertForSequenceClassification from Tech-oriented +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_tech_oriented +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_tech_oriented` is an English model originally trained by Tech-oriented. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_tech_oriented_en_5.5.0_3.0_1727340585188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_tech_oriented_en_5.5.0_3.0_1727340585188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_tech_oriented","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_tech_oriented", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_tech_oriented| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Tech-oriented/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst_2_english_r_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst_2_english_r_en.md new file mode 100644 index 00000000000000..49a765473b807c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_sst_2_english_r_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst_2_english_r BertForSequenceClassification from rishikesan +author: John Snow Labs +name: bert_base_uncased_finetuned_sst_2_english_r +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst_2_english_r` is an English model originally trained by rishikesan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst_2_english_r_en_5.5.0_3.0_1727332953113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst_2_english_r_en_5.5.0_3.0_1727332953113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst_2_english_r","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst_2_english_r", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst_2_english_r| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rishikesan/bert-base-uncased-finetuned-sst-2-english-r \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_stsb_airay_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_stsb_airay_pipeline_en.md new file mode 100644 index 00000000000000..9f9e1361b0ba6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_stsb_airay_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_stsb_airay_pipeline pipeline BertForSequenceClassification from airay +author: John Snow Labs +name: bert_base_uncased_finetuned_stsb_airay_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_stsb_airay_pipeline` is an English model originally trained by airay. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_stsb_airay_pipeline_en_5.5.0_3.0_1727367138450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_stsb_airay_pipeline_en_5.5.0_3.0_1727367138450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_stsb_airay_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_stsb_airay_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_stsb_airay_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/airay/bert-base-uncased-finetuned-stsb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_suicide_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_suicide_en.md new file mode 100644 index 00000000000000..45ab977be0a1f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_suicide_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_suicide BertForSequenceClassification from psychicautomaton +author: John Snow Labs +name: bert_base_uncased_finetuned_suicide +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_suicide` is an English model originally trained by psychicautomaton. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_suicide_en_5.5.0_3.0_1727336591200.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_suicide_en_5.5.0_3.0_1727336591200.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_suicide","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_suicide", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_suicide| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/psychicautomaton/bert-base-uncased-finetuned-suicide \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_suicide_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_suicide_pipeline_en.md new file mode 100644 index 00000000000000..65686f63eecf0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_suicide_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_suicide_pipeline pipeline BertForSequenceClassification from psychicautomaton +author: John Snow Labs +name: bert_base_uncased_finetuned_suicide_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_suicide_pipeline` is an English model originally trained by psychicautomaton. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_suicide_pipeline_en_5.5.0_3.0_1727336614227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_suicide_pipeline_en_5.5.0_3.0_1727336614227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_suicide_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_suicide_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_suicide_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/psychicautomaton/bert-base-uncased-finetuned-suicide + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_t_vendor_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_t_vendor_en.md new file mode 100644 index 00000000000000..13f8b5c5967577 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_t_vendor_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_t_vendor BertForSequenceClassification from Gregorig +author: John Snow Labs +name: bert_base_uncased_finetuned_t_vendor +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_t_vendor` is an English model originally trained by Gregorig. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_t_vendor_en_5.5.0_3.0_1727309572434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_t_vendor_en_5.5.0_3.0_1727309572434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_t_vendor","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_t_vendor", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_t_vendor| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Gregorig/bert-base-uncased-finetuned-t_vendor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_t_vendor_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_t_vendor_pipeline_en.md new file mode 100644 index 00000000000000..b6877648845ec8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_t_vendor_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_t_vendor_pipeline pipeline BertForSequenceClassification from Gregorig +author: John Snow Labs +name: bert_base_uncased_finetuned_t_vendor_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_t_vendor_pipeline` is an English model originally trained by Gregorig. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_t_vendor_pipeline_en_5.5.0_3.0_1727309593692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_t_vendor_pipeline_en_5.5.0_3.0_1727309593692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_t_vendor_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_t_vendor_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_t_vendor_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Gregorig/bert-base-uncased-finetuned-t_vendor + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_wnli_sujatha2502_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_wnli_sujatha2502_en.md new file mode 100644 index 00000000000000..63a54bb3be584d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_finetuned_wnli_sujatha2502_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_wnli_sujatha2502 BertForSequenceClassification from sujatha2502 +author: John Snow Labs +name: bert_base_uncased_finetuned_wnli_sujatha2502 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wnli_sujatha2502` is an English model originally trained by sujatha2502. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wnli_sujatha2502_en_5.5.0_3.0_1727319895842.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wnli_sujatha2502_en_5.5.0_3.0_1727319895842.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_wnli_sujatha2502","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_wnli_sujatha2502", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_wnli_sujatha2502| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sujatha2502/bert-base-uncased-finetuned-wnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ft_news_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ft_news_en.md new file mode 100644 index 00000000000000..7294a131c537ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ft_news_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_ft_news BertForSequenceClassification from MatFil99 +author: John Snow Labs +name: bert_base_uncased_ft_news +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ft_news` is an English model originally trained by MatFil99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ft_news_en_5.5.0_3.0_1727342860640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ft_news_en_5.5.0_3.0_1727342860640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ft_news","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ft_news", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ft_news| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MatFil99/bert-base-uncased-ft-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ft_news_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ft_news_pipeline_en.md new file mode 100644 index 00000000000000..68038ae546b8da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_ft_news_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_ft_news_pipeline pipeline BertForSequenceClassification from MatFil99 +author: John Snow Labs +name: bert_base_uncased_ft_news_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ft_news_pipeline` is an English model originally trained by MatFil99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ft_news_pipeline_en_5.5.0_3.0_1727342881727.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ft_news_pipeline_en_5.5.0_3.0_1727342881727.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_ft_news_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_ft_news_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ft_news_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MatFil99/bert-base-uncased-ft-news + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_glue_mrpc_camilovg_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_glue_mrpc_camilovg_en.md new file mode 100644 index 00000000000000..93e76b8c7a8dbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_glue_mrpc_camilovg_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_glue_mrpc_camilovg BertForSequenceClassification from camiloTel0410 +author: John Snow Labs +name: bert_base_uncased_glue_mrpc_camilovg +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_glue_mrpc_camilovg` is an English model originally trained by camiloTel0410.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_mrpc_camilovg_en_5.5.0_3.0_1727369451949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_mrpc_camilovg_en_5.5.0_3.0_1727369451949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_glue_mrpc_camilovg","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_glue_mrpc_camilovg", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_glue_mrpc_camilovg| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/camiloTel0410/bert-base-uncased-glue-mrpc-camilovg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_glue_mrpc_camilovg_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_glue_mrpc_camilovg_pipeline_en.md new file mode 100644 index 00000000000000..2302170b66a022 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_glue_mrpc_camilovg_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_glue_mrpc_camilovg_pipeline pipeline BertForSequenceClassification from camiloTel0410 +author: John Snow Labs +name: bert_base_uncased_glue_mrpc_camilovg_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_glue_mrpc_camilovg_pipeline` is an English model originally trained by camiloTel0410.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_mrpc_camilovg_pipeline_en_5.5.0_3.0_1727369472482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_mrpc_camilovg_pipeline_en_5.5.0_3.0_1727369472482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_glue_mrpc_camilovg_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_glue_mrpc_camilovg_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_glue_mrpc_camilovg_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/camiloTel0410/bert-base-uncased-glue-mrpc-camilovg + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_google_boolq_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_google_boolq_en.md new file mode 100644 index 00000000000000..49f1500e010360 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_google_boolq_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_google_boolq BertForSequenceClassification from pranay-j +author: John Snow Labs +name: bert_base_uncased_google_boolq +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_google_boolq` is an English model originally trained by pranay-j. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_google_boolq_en_5.5.0_3.0_1727349693702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_google_boolq_en_5.5.0_3.0_1727349693702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_google_boolq","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_google_boolq", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_google_boolq| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/pranay-j/bert-base-uncased-google-boolq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_google_boolq_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_google_boolq_pipeline_en.md new file mode 100644 index 00000000000000..b0a91f03f6e67f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_google_boolq_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_google_boolq_pipeline pipeline BertForSequenceClassification from pranay-j +author: John Snow Labs +name: bert_base_uncased_google_boolq_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_google_boolq_pipeline` is an English model originally trained by pranay-j. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_google_boolq_pipeline_en_5.5.0_3.0_1727349715135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_google_boolq_pipeline_en_5.5.0_3.0_1727349715135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_google_boolq_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_google_boolq_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_google_boolq_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/pranay-j/bert-base-uncased-google-boolq + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_grouped_textsim_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_grouped_textsim_en.md new file mode 100644 index 00000000000000..045adb232c1675 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_grouped_textsim_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_grouped_textsim BertForSequenceClassification from kaanakdeniz +author: John Snow Labs +name: bert_base_uncased_grouped_textsim +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_grouped_textsim` is an English model originally trained by kaanakdeniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_grouped_textsim_en_5.5.0_3.0_1727320956680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_grouped_textsim_en_5.5.0_3.0_1727320956680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_grouped_textsim","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_grouped_textsim", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_grouped_textsim| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kaanakdeniz/bert_base_uncased_grouped_textsim \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_grouped_textsim_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_grouped_textsim_pipeline_en.md new file mode 100644 index 00000000000000..3d3eadbba221ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_grouped_textsim_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_grouped_textsim_pipeline pipeline BertForSequenceClassification from kaanakdeniz +author: John Snow Labs +name: bert_base_uncased_grouped_textsim_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_grouped_textsim_pipeline` is an English model originally trained by kaanakdeniz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_grouped_textsim_pipeline_en_5.5.0_3.0_1727320980488.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_grouped_textsim_pipeline_en_5.5.0_3.0_1727320980488.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_grouped_textsim_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_grouped_textsim_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_grouped_textsim_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kaanakdeniz/bert_base_uncased_grouped_textsim + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hatexplain_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hatexplain_pipeline_en.md new file mode 100644 index 00000000000000..976708bd6f91e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hatexplain_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_hatexplain_pipeline pipeline BertForSequenceClassification from Kanit +author: John Snow Labs +name: bert_base_uncased_hatexplain_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_hatexplain_pipeline` is an English model originally trained by Kanit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hatexplain_pipeline_en_5.5.0_3.0_1727344611284.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hatexplain_pipeline_en_5.5.0_3.0_1727344611284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_hatexplain_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_hatexplain_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_hatexplain_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kanit/bert-base-uncased-hateXplain + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hdb_0420_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hdb_0420_en.md new file mode 100644 index 00000000000000..d791b3b547e0a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hdb_0420_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_hdb_0420 BertForSequenceClassification from cestwc +author: John Snow Labs +name: bert_base_uncased_hdb_0420 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_hdb_0420` is an English model originally trained by cestwc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hdb_0420_en_5.5.0_3.0_1727312606364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hdb_0420_en_5.5.0_3.0_1727312606364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hdb_0420","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hdb_0420", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_hdb_0420| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/cestwc/bert-base-uncased-hdb-0420 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_header_plus_textsim_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_header_plus_textsim_pipeline_en.md new file mode 100644 index 00000000000000..a840001fd20553 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_header_plus_textsim_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_header_plus_textsim_pipeline pipeline BertForSequenceClassification from kaanakdeniz +author: John Snow Labs +name: bert_base_uncased_header_plus_textsim_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_header_plus_textsim_pipeline` is an English model originally trained by kaanakdeniz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_header_plus_textsim_pipeline_en_5.5.0_3.0_1727350045829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_header_plus_textsim_pipeline_en_5.5.0_3.0_1727350045829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_header_plus_textsim_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_header_plus_textsim_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_header_plus_textsim_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kaanakdeniz/bert_base_uncased_header_plus_textsim + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hoax_classifier_v3_defs_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hoax_classifier_v3_defs_en.md new file mode 100644 index 00000000000000..900aa0c75bc1d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_hoax_classifier_v3_defs_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_hoax_classifier_v3_defs BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_base_uncased_hoax_classifier_v3_defs +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_hoax_classifier_v3_defs` is an English model originally trained by research-dump. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_v3_defs_en_5.5.0_3.0_1727349110458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_v3_defs_en_5.5.0_3.0_1727349110458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_v3_defs","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_v3_defs", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_hoax_classifier_v3_defs| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/research-dump/bert-base-uncased_hoax_classifier_v3_defs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_imdb_trained_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_imdb_trained_pipeline_en.md new file mode 100644 index 00000000000000..bdc5193b82f0dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_imdb_trained_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_imdb_trained_pipeline pipeline BertForSequenceClassification from JakobKaiser +author: John Snow Labs +name: bert_base_uncased_imdb_trained_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_imdb_trained_pipeline` is an English model originally trained by JakobKaiser. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_imdb_trained_pipeline_en_5.5.0_3.0_1727341228254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_imdb_trained_pipeline_en_5.5.0_3.0_1727341228254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_imdb_trained_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_imdb_trained_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_imdb_trained_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JakobKaiser/bert-base-uncased-imdb-trained + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_jigsaw_toxic_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_jigsaw_toxic_classifier_en.md new file mode 100644 index 00000000000000..64030be8af9b59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_jigsaw_toxic_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_jigsaw_toxic_classifier BertForSequenceClassification from berkaysahiin +author: John Snow Labs +name: bert_base_uncased_jigsaw_toxic_classifier +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_jigsaw_toxic_classifier` is an English model originally trained by berkaysahiin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_jigsaw_toxic_classifier_en_5.5.0_3.0_1727348920837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_jigsaw_toxic_classifier_en_5.5.0_3.0_1727348920837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_jigsaw_toxic_classifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_jigsaw_toxic_classifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_jigsaw_toxic_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/berkaysahiin/bert-base-uncased-jigsaw-toxic-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_jigsaw_toxic_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_jigsaw_toxic_classifier_pipeline_en.md new file mode 100644 index 00000000000000..aa1dcc7b8b756e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_jigsaw_toxic_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_jigsaw_toxic_classifier_pipeline pipeline BertForSequenceClassification from berkaysahiin +author: John Snow Labs +name: bert_base_uncased_jigsaw_toxic_classifier_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_jigsaw_toxic_classifier_pipeline` is an English model originally trained by berkaysahiin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_jigsaw_toxic_classifier_pipeline_en_5.5.0_3.0_1727348942064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_jigsaw_toxic_classifier_pipeline_en_5.5.0_3.0_1727348942064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_jigsaw_toxic_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_jigsaw_toxic_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_jigsaw_toxic_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/berkaysahiin/bert-base-uncased-jigsaw-toxic-classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_kaggle_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_kaggle_en.md new file mode 100644 index 00000000000000..02189e2dff34d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_kaggle_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_kaggle BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_kaggle +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_kaggle` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_kaggle_en_5.5.0_3.0_1727313859962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_kaggle_en_5.5.0_3.0_1727313859962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_kaggle","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_kaggle", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_kaggle| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-kaggle \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_epochs_2_lr_1e_05_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_epochs_2_lr_1e_05_en.md new file mode 100644 index 00000000000000..b21e4e23ebc779 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_epochs_2_lr_1e_05_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_mnli_epochs_2_lr_1e_05 BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_mnli_epochs_2_lr_1e_05 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli_epochs_2_lr_1e_05` is an English model originally trained by prateeky2806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_epochs_2_lr_1e_05_en_5.5.0_3.0_1727337158342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_epochs_2_lr_1e_05_en_5.5.0_3.0_1727337158342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_epochs_2_lr_1e_05","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_epochs_2_lr_1e_05", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli_epochs_2_lr_1e_05| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-mnli-epochs-2-lr-1e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_howey_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_howey_en.md new file mode 100644 index 00000000000000..36fb737f8c08b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_howey_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_mnli_howey BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_mnli_howey +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_howey_en_5.5.0_3.0_1727317014081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_howey_en_5.5.0_3.0_1727317014081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_howey","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_howey", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli_howey| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_howey_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_howey_pipeline_en.md new file mode 100644 index 00000000000000..e0a5309b820a94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mnli_howey_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_mnli_howey_pipeline pipeline BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_mnli_howey_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli_howey_pipeline` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_howey_pipeline_en_5.5.0_3.0_1727317035421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_howey_pipeline_en_5.5.0_3.0_1727317035421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_mnli_howey_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_mnli_howey_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli_howey_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-mnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mrpc_howey_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mrpc_howey_en.md new file mode 100644 index 00000000000000..c7f942beaccdd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mrpc_howey_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_howey BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_mrpc_howey +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_howey_en_5.5.0_3.0_1727342114497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_howey_en_5.5.0_3.0_1727342114497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_howey","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_howey", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_howey| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mrpc_howey_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mrpc_howey_pipeline_en.md new file mode 100644 index 00000000000000..3a40e8052aa04a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_mrpc_howey_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_howey_pipeline pipeline BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_mrpc_howey_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_howey_pipeline` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_howey_pipeline_en_5.5.0_3.0_1727342136498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_howey_pipeline_en_5.5.0_3.0_1727342136498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_mrpc_howey_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_mrpc_howey_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_howey_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-mrpc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_news_about_gold_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_news_about_gold_en.md new file mode 100644 index 00000000000000..2ed3946a7b6aea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_news_about_gold_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_news_about_gold BertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: bert_base_uncased_news_about_gold +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_news_about_gold` is an English model originally trained by DunnBC22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_news_about_gold_en_5.5.0_3.0_1727346464577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_news_about_gold_en_5.5.0_3.0_1727346464577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_news_about_gold","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_news_about_gold", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_news_about_gold| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/DunnBC22/bert-base-uncased-News_About_Gold \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_newscategoryclassification_fullmodel_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_newscategoryclassification_fullmodel_en.md new file mode 100644 index 00000000000000..f5925f9ceaabb6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_newscategoryclassification_fullmodel_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_newscategoryclassification_fullmodel BertForSequenceClassification from akashmaggon +author: John Snow Labs +name: bert_base_uncased_newscategoryclassification_fullmodel +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_newscategoryclassification_fullmodel` is an English model originally trained by akashmaggon. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_newscategoryclassification_fullmodel_en_5.5.0_3.0_1727349570537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_newscategoryclassification_fullmodel_en_5.5.0_3.0_1727349570537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_newscategoryclassification_fullmodel","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_newscategoryclassification_fullmodel", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_newscategoryclassification_fullmodel| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/akashmaggon/bert-base-uncased-newscategoryclassification-fullmodel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_newscategoryclassification_fullmodel_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_newscategoryclassification_fullmodel_pipeline_en.md new file mode 100644 index 00000000000000..3c281bedec2365 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_newscategoryclassification_fullmodel_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_newscategoryclassification_fullmodel_pipeline pipeline BertForSequenceClassification from akashmaggon +author: John Snow Labs +name: bert_base_uncased_newscategoryclassification_fullmodel_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_newscategoryclassification_fullmodel_pipeline` is an English model originally trained by akashmaggon. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_newscategoryclassification_fullmodel_pipeline_en_5.5.0_3.0_1727349593293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_newscategoryclassification_fullmodel_pipeline_en_5.5.0_3.0_1727349593293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_newscategoryclassification_fullmodel_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_newscategoryclassification_fullmodel_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_newscategoryclassification_fullmodel_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/akashmaggon/bert-base-uncased-newscategoryclassification-fullmodel + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_offenseval2019_downsample_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_offenseval2019_downsample_en.md new file mode 100644 index 00000000000000..864a2ea040a81d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_offenseval2019_downsample_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_offenseval2019_downsample BertForSequenceClassification from mohsenfayyaz +author: John Snow Labs +name: bert_base_uncased_offenseval2019_downsample +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_offenseval2019_downsample` is an English model originally trained by mohsenfayyaz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_downsample_en_5.5.0_3.0_1727351956024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_downsample_en_5.5.0_3.0_1727351956024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_offenseval2019_downsample","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_offenseval2019_downsample", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_offenseval2019_downsample| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mohsenfayyaz/bert-base-uncased-offenseval2019-downsample \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_offenseval2019_downsample_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_offenseval2019_downsample_pipeline_en.md new file mode 100644 index 00000000000000..8e65d5cf42862c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_offenseval2019_downsample_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_offenseval2019_downsample_pipeline pipeline BertForSequenceClassification from mohsenfayyaz +author: John Snow Labs +name: bert_base_uncased_offenseval2019_downsample_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_offenseval2019_downsample_pipeline` is an English model originally trained by mohsenfayyaz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_downsample_pipeline_en_5.5.0_3.0_1727351977819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_offenseval2019_downsample_pipeline_en_5.5.0_3.0_1727351977819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_offenseval2019_downsample_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_offenseval2019_downsample_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_offenseval2019_downsample_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mohsenfayyaz/bert-base-uncased-offenseval2019-downsample + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_polarizeai_1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_polarizeai_1_en.md new file mode 100644 index 00000000000000..aa3059f61fd217 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_polarizeai_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_polarizeai_1 BertForSequenceClassification from nisandij +author: John Snow Labs +name: bert_base_uncased_polarizeai_1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_polarizeai_1` is an English model originally trained by nisandij. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_polarizeai_1_en_5.5.0_3.0_1727317262275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_polarizeai_1_en_5.5.0_3.0_1727317262275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_polarizeai_1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_polarizeai_1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_polarizeai_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nisandij/bert-base-uncased-PolarizeAI-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_polarizeai_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_polarizeai_1_pipeline_en.md new file mode 100644 index 00000000000000..e87b1133f64b2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_polarizeai_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_polarizeai_1_pipeline pipeline BertForSequenceClassification from nisandij +author: John Snow Labs +name: bert_base_uncased_polarizeai_1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_polarizeai_1_pipeline` is an English model originally trained by nisandij. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_polarizeai_1_pipeline_en_5.5.0_3.0_1727317284009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_polarizeai_1_pipeline_en_5.5.0_3.0_1727317284009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_polarizeai_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_polarizeai_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_polarizeai_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nisandij/bert-base-uncased-PolarizeAI-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qnli_from_bert_large_uncased_qnli_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qnli_from_bert_large_uncased_qnli_en.md new file mode 100644 index 00000000000000..be7790f9918283 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qnli_from_bert_large_uncased_qnli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_qnli_from_bert_large_uncased_qnli BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_qnli_from_bert_large_uncased_qnli +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qnli_from_bert_large_uncased_qnli` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qnli_from_bert_large_uncased_qnli_en_5.5.0_3.0_1727337794139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qnli_from_bert_large_uncased_qnli_en_5.5.0_3.0_1727337794139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qnli_from_bert_large_uncased_qnli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qnli_from_bert_large_uncased_qnli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qnli_from_bert_large_uncased_qnli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-qnli_from_bert-large-uncased-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qqp_howey_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qqp_howey_en.md new file mode 100644 index 00000000000000..d1d55c818da456 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qqp_howey_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_qqp_howey BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_qqp_howey +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qqp_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_howey_en_5.5.0_3.0_1727347980738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_howey_en_5.5.0_3.0_1727347980738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_howey","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_howey", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qqp_howey| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qqp_howey_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qqp_howey_pipeline_en.md new file mode 100644 index 00000000000000..5a7d7e7836e5e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_qqp_howey_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_qqp_howey_pipeline pipeline BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_qqp_howey_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qqp_howey_pipeline` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_howey_pipeline_en_5.5.0_3.0_1727348002466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_howey_pipeline_en_5.5.0_3.0_1727348002466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_qqp_howey_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_qqp_howey_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qqp_howey_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-qqp + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline_en.md new file mode 100644 index 00000000000000..0f0f2a842c842a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline pipeline BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline` is an English model originally trained by prateeky2806. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline_en_5.5.0_3.0_1727338641132.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline_en_5.5.0_3.0_1727338641132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_rte_epochs_10_lr_1e_05_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-rte-epochs-10-lr-1e-05 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_spam_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_spam_classifier_en.md new file mode 100644 index 00000000000000..62b5374a38abfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_spam_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_spam_classifier BertForSequenceClassification from pritihalder98 +author: John Snow Labs +name: bert_base_uncased_spam_classifier +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_spam_classifier` is an English model originally trained by pritihalder98. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_spam_classifier_en_5.5.0_3.0_1727315324534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_spam_classifier_en_5.5.0_3.0_1727315324534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_spam_classifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_spam_classifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_spam_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/pritihalder98/bert-base-uncased-spam-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_spam_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_spam_classifier_pipeline_en.md new file mode 100644 index 00000000000000..1970d5e0140c03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_spam_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_spam_classifier_pipeline pipeline BertForSequenceClassification from pritihalder98 +author: John Snow Labs +name: bert_base_uncased_spam_classifier_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_spam_classifier_pipeline` is an English model originally trained by pritihalder98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_spam_classifier_pipeline_en_5.5.0_3.0_1727315358578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_spam_classifier_pipeline_en_5.5.0_3.0_1727315358578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_spam_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_spam_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_spam_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|627.8 MB| + +## References + +https://huggingface.co/pritihalder98/bert-base-uncased-spam-classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_kowsiknd_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_kowsiknd_en.md new file mode 100644 index 00000000000000..4769e3d70b9ea4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_kowsiknd_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_sst2_kowsiknd BertForSequenceClassification from kowsiknd +author: John Snow Labs +name: bert_base_uncased_sst2_kowsiknd +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_kowsiknd` is an English model originally trained by kowsiknd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_kowsiknd_en_5.5.0_3.0_1727348833223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_kowsiknd_en_5.5.0_3.0_1727348833223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_kowsiknd","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_kowsiknd", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_kowsiknd| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kowsiknd/bert-base-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_nncf_unstructured_sparse_80_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_nncf_unstructured_sparse_80_en.md new file mode 100644 index 00000000000000..02f679baec72ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_nncf_unstructured_sparse_80_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_sst2_nncf_unstructured_sparse_80 BertForSequenceClassification from yujiepan +author: John Snow Labs +name: bert_base_uncased_sst2_nncf_unstructured_sparse_80 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_nncf_unstructured_sparse_80` is an English model originally trained by yujiepan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_nncf_unstructured_sparse_80_en_5.5.0_3.0_1727317487439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_nncf_unstructured_sparse_80_en_5.5.0_3.0_1727317487439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_nncf_unstructured_sparse_80","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_nncf_unstructured_sparse_80", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_nncf_unstructured_sparse_80| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.9 MB| + +## References + +https://huggingface.co/yujiepan/bert-base-uncased-sst2-NNCF-unstructured-sparse-80 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline_en.md new file mode 100644 index 00000000000000..82627585ead1b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline pipeline BertForSequenceClassification from yujiepan +author: John Snow Labs +name: bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline` is an English model originally trained by yujiepan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline_en_5.5.0_3.0_1727317513786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline_en_5.5.0_3.0_1727317513786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_nncf_unstructured_sparse_80_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.9 MB| + +## References + +https://huggingface.co/yujiepan/bert-base-uncased-sst2-NNCF-unstructured-sparse-80 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained_en.md new file mode 100644 index 00000000000000..dac184d7b6015e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained BertForSequenceClassification from JakobKaiser +author: John Snow Labs +name: bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained` is an English model originally trained by JakobKaiser.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained_en_5.5.0_3.0_1727344253600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained_en_5.5.0_3.0_1727344253600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_symptom_tonga_tonga_islands_diagnosis_trained| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/JakobKaiser/bert-base-uncased-symptom_to_diagnosis-trained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_tense_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_tense_pipeline_en.md new file mode 100644 index 00000000000000..baf340fac0b3f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_tense_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_tense_pipeline pipeline BertForSequenceClassification from freethenation +author: John Snow Labs +name: bert_base_uncased_tense_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_tense_pipeline` is an English model originally trained by freethenation. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_tense_pipeline_en_5.5.0_3.0_1727351301211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_tense_pipeline_en_5.5.0_3.0_1727351301211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_tense_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_tense_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_tense_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/freethenation/bert-base-uncased-tense + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230912_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230912_en.md new file mode 100644 index 00000000000000..84c1e4a36b736d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230912_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_textcls_rheology_20230912 BertForSequenceClassification from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_textcls_rheology_20230912 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_textcls_rheology_20230912` is an English model originally trained by jonas-luehrs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_20230912_en_5.5.0_3.0_1727370204620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_20230912_en_5.5.0_3.0_1727370204620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_textcls_rheology_20230912","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_textcls_rheology_20230912", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_textcls_rheology_20230912| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-textCLS-RHEOLOGY-20230912 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230913_1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230913_1_en.md new file mode 100644 index 00000000000000..cdccc383e1d4b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230913_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_textcls_rheology_20230913_1 BertForSequenceClassification from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_textcls_rheology_20230913_1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_textcls_rheology_20230913_1` is an English model originally trained by jonas-luehrs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_20230913_1_en_5.5.0_3.0_1727330406111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_20230913_1_en_5.5.0_3.0_1727330406111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_textcls_rheology_20230913_1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_textcls_rheology_20230913_1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_textcls_rheology_20230913_1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-textCLS-RHEOLOGY-20230913-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230913_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230913_1_pipeline_en.md new file mode 100644 index 00000000000000..23d5a82459dc4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_textcls_rheology_20230913_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_textcls_rheology_20230913_1_pipeline pipeline BertForSequenceClassification from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_textcls_rheology_20230913_1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_textcls_rheology_20230913_1_pipeline` is an English model originally trained by jonas-luehrs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_20230913_1_pipeline_en_5.5.0_3.0_1727330427257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_20230913_1_pipeline_en_5.5.0_3.0_1727330427257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_textcls_rheology_20230913_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_textcls_rheology_20230913_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_textcls_rheology_20230913_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-textCLS-RHEOLOGY-20230913-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_tminer_hs_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_tminer_hs_en.md new file mode 100644 index 00000000000000..7898ac3968bedf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_tminer_hs_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_tminer_hs BertForSequenceClassification from nealmgkr +author: John Snow Labs +name: bert_base_uncased_tminer_hs +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_tminer_hs` is an English model originally trained by nealmgkr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_tminer_hs_en_5.5.0_3.0_1727311409279.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_tminer_hs_en_5.5.0_3.0_1727311409279.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_tminer_hs","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_tminer_hs", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_tminer_hs| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nealmgkr/bert-base-uncased-tminer-hs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_qqp_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_qqp_pipeline_en.md new file mode 100644 index 00000000000000..07c865c721ac81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_qqp_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_top_pruned_qqp_pipeline pipeline BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_top_pruned_qqp_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_top_pruned_qqp_pipeline` is an English model originally trained by senfu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_qqp_pipeline_en_5.5.0_3.0_1727356013721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_qqp_pipeline_en_5.5.0_3.0_1727356013721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
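+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_top_pruned_qqp_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_top_pruned_qqp_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +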
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_top_pruned_qqp_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-top-pruned-qqp + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_rte_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_rte_en.md new file mode 100644 index 00000000000000..becb79a990a24a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_rte_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_top_pruned_rte BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_top_pruned_rte +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_top_pruned_rte` is an English model originally trained by senfu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_rte_en_5.5.0_3.0_1727316008223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_rte_en_5.5.0_3.0_1727316008223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_top_pruned_rte","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_top_pruned_rte", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_top_pruned_rte| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-top-pruned-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_rte_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_rte_pipeline_en.md new file mode 100644 index 00000000000000..df219c3d0896be --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_top_pruned_rte_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_top_pruned_rte_pipeline pipeline BertForSequenceClassification from senfu +author: John Snow Labs +name: bert_base_uncased_top_pruned_rte_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_top_pruned_rte_pipeline` is an English model originally trained by senfu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_rte_pipeline_en_5.5.0_3.0_1727316030193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_top_pruned_rte_pipeline_en_5.5.0_3.0_1727316030193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
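+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_top_pruned_rte_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_top_pruned_rte_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +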
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_top_pruned_rte_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/senfu/bert-base-uncased-top-pruned-rte + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_twitter_sentiment_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_twitter_sentiment_en.md new file mode 100644 index 00000000000000..29bf294d418feb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_twitter_sentiment_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_twitter_sentiment BertForSequenceClassification from tmchan0003 +author: John Snow Labs +name: bert_base_uncased_twitter_sentiment +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_twitter_sentiment` is an English model originally trained by tmchan0003. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_twitter_sentiment_en_5.5.0_3.0_1727337987280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_twitter_sentiment_en_5.5.0_3.0_1727337987280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_twitter_sentiment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_twitter_sentiment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_twitter_sentiment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/tmchan0003/bert-base-uncased-twitter-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_twitter_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_twitter_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..6f96ceaa046df1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_twitter_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_twitter_sentiment_pipeline pipeline BertForSequenceClassification from tmchan0003 +author: John Snow Labs +name: bert_base_uncased_twitter_sentiment_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_twitter_sentiment_pipeline` is an English model originally trained by tmchan0003. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_twitter_sentiment_pipeline_en_5.5.0_3.0_1727338007774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_twitter_sentiment_pipeline_en_5.5.0_3.0_1727338007774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
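+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_twitter_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_twitter_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +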
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_twitter_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/tmchan0003/bert-base-uncased-twitter-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_v1_pipeline_en.md new file mode 100644 index 00000000000000..3739a3daed8bbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_v1_pipeline pipeline BertForSequenceClassification from echoquery +author: John Snow Labs +name: bert_base_uncased_v1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_v1_pipeline` is an English model originally trained by echoquery. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_v1_pipeline_en_5.5.0_3.0_1727318551180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_v1_pipeline_en_5.5.0_3.0_1727318551180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
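+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +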
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/echoquery/bert-base-uncased-v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_winobias_classifieronly_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_winobias_classifieronly_en.md new file mode 100644 index 00000000000000..7d6a32942fd1be --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_winobias_classifieronly_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_uncased_winobias_classifieronly BertForSequenceClassification from asun17904 +author: John Snow Labs +name: bert_base_uncased_winobias_classifieronly +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_winobias_classifieronly` is an English model originally trained by asun17904. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_winobias_classifieronly_en_5.5.0_3.0_1727347196306.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_winobias_classifieronly_en_5.5.0_3.0_1727347196306.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_winobias_classifieronly","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_winobias_classifieronly", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_winobias_classifieronly| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/asun17904/bert-base-uncased_winobias_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_en.md new file mode 100644 index 00000000000000..415fbe34caf4be --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English bert_base_uncased_wnli BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_wnli +date: 2024-09-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_wnli` is an English model originally trained by textattack. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_en_5.5.0_3.0_1727317922649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_en_5.5.0_3.0_1727317922649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_wnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_wnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_wnli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +References + +https://huggingface.co/textattack/bert-base-uncased-WNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline_en.md new file mode 100644 index 00000000000000..5711af26c9f8f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline pipeline BertForSequenceClassification from prateeky2806 +author: John Snow Labs +name: bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline` is an English model originally trained by prateeky2806. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline_en_5.5.0_3.0_1727338825632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline_en_5.5.0_3.0_1727338825632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
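+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +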
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_wnli_epochs_10_lr_0_0001_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prateeky2806/bert-base-uncased-wnli-epochs-10-lr-0.0001 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_pipeline_en.md new file mode 100644 index 00000000000000..3c21ab8ffff10d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_wnli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_wnli_pipeline pipeline BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_wnli_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_wnli_pipeline` is an English model originally trained by yoshitomo-matsubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_pipeline_en_5.5.0_3.0_1727317949207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_pipeline_en_5.5.0_3.0_1727317949207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
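+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_wnli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_wnli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +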
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_wnli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-wnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_yelp_review_full_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_yelp_review_full_pipeline_en.md new file mode 100644 index 00000000000000..4a27e9dd7c8ebf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_base_uncased_yelp_review_full_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_uncased_yelp_review_full_pipeline pipeline BertForSequenceClassification from ixaxaar +author: John Snow Labs +name: bert_base_uncased_yelp_review_full_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_yelp_review_full_pipeline` is an English model originally trained by ixaxaar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yelp_review_full_pipeline_en_5.5.0_3.0_1727317044211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yelp_review_full_pipeline_en_5.5.0_3.0_1727317044211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
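+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_base_uncased_yelp_review_full_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_base_uncased_yelp_review_full_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +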
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_yelp_review_full_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.1 MB| + +## References + +https://huggingface.co/ixaxaar/bert-base-uncased_yelp-review-full + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_emotion_khaldiabderrhmane_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_emotion_khaldiabderrhmane_pipeline_en.md new file mode 100644 index 00000000000000..edac037ca82452 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_emotion_khaldiabderrhmane_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_emotion_khaldiabderrhmane_pipeline pipeline BertForSequenceClassification from KhaldiAbderrhmane +author: John Snow Labs +name: bert_emotion_khaldiabderrhmane_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_khaldiabderrhmane_pipeline` is an English model originally trained by KhaldiAbderrhmane. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_khaldiabderrhmane_pipeline_en_5.5.0_3.0_1727356117813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_khaldiabderrhmane_pipeline_en_5.5.0_3.0_1727356117813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
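+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # sample input; assumes the pipeline reads a "text" column +pipeline = PretrainedPipeline("bert_emotion_khaldiabderrhmane_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala +val df = Seq("I love spark-nlp").toDS.toDF("text") // sample input; assumes the pipeline reads a "text" column +val pipeline = new PretrainedPipeline("bert_emotion_khaldiabderrhmane_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +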
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_khaldiabderrhmane_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/KhaldiAbderrhmane/bert-emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_fine_tuned_cola_pallavi176_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_fine_tuned_cola_pallavi176_en.md new file mode 100644 index 00000000000000..171d9e384104c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_fine_tuned_cola_pallavi176_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_fine_tuned_cola_pallavi176 BertForSequenceClassification from pallavi176 +author: John Snow Labs +name: bert_fine_tuned_cola_pallavi176 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fine_tuned_cola_pallavi176` is an English model originally trained by pallavi176. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_pallavi176_en_5.5.0_3.0_1727315308379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_pallavi176_en_5.5.0_3.0_1727315308379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_pallavi176","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_pallavi176", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fine_tuned_cola_pallavi176| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/pallavi176/bert-fine-tuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_brianchu26_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_brianchu26_pipeline_en.md new file mode 100644 index 00000000000000..6cbe3179836920 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_brianchu26_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_brianchu26_pipeline pipeline BertForSequenceClassification from brianchu26 +author: John Snow Labs +name: bert_finetuned_brianchu26_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_brianchu26_pipeline` is an English model originally trained by brianchu26. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_brianchu26_pipeline_en_5.5.0_3.0_1727314947593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_brianchu26_pipeline_en_5.5.0_3.0_1727314947593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_brianchu26_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_brianchu26_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_brianchu26_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/brianchu26/bert_finetuned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_dariodematties_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_dariodematties_en.md new file mode 100644 index 00000000000000..e6f2072c14751e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_dariodematties_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_mrpc_dariodematties BertForSequenceClassification from dariodematties +author: John Snow Labs +name: bert_finetuned_mrpc_dariodematties +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_mrpc_dariodematties` is an English model originally trained by dariodematties. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_mrpc_dariodematties_en_5.5.0_3.0_1727354750763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_mrpc_dariodematties_en_5.5.0_3.0_1727354750763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_mrpc_dariodematties","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_mrpc_dariodematties", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_mrpc_dariodematties| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/dariodematties/bert-finetuned-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_jkassemi_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_jkassemi_en.md new file mode 100644 index 00000000000000..7efcca0d406716 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_jkassemi_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_mrpc_jkassemi BertForSequenceClassification from jkassemi +author: John Snow Labs +name: bert_finetuned_mrpc_jkassemi +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_mrpc_jkassemi` is an English model originally trained by jkassemi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_mrpc_jkassemi_en_5.5.0_3.0_1727351492494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_mrpc_jkassemi_en_5.5.0_3.0_1727351492494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_mrpc_jkassemi","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_mrpc_jkassemi", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_mrpc_jkassemi| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jkassemi/bert-finetuned-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_simtaewan_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_simtaewan_en.md new file mode 100644 index 00000000000000..3be4aa6cefdc00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_mrpc_simtaewan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_mrpc_simtaewan BertForSequenceClassification from Simtaewan +author: John Snow Labs +name: bert_finetuned_mrpc_simtaewan +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_mrpc_simtaewan` is an English model originally trained by Simtaewan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_mrpc_simtaewan_en_5.5.0_3.0_1727317616444.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_mrpc_simtaewan_en_5.5.0_3.0_1727317616444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_mrpc_simtaewan","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_mrpc_simtaewan", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_mrpc_simtaewan| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Simtaewan/bert-finetuned-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_andriidemk_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_andriidemk_en.md new file mode 100644 index 00000000000000..9f8281d2921e21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_andriidemk_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_semitic_languages_eval_english_andriidemk BertForSequenceClassification from AndriiDemk +author: John Snow Labs +name: bert_finetuned_semitic_languages_eval_english_andriidemk +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_semitic_languages_eval_english_andriidemk` is an English model originally trained by AndriiDemk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_andriidemk_en_5.5.0_3.0_1727317099203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_andriidemk_en_5.5.0_3.0_1727317099203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_andriidemk","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_andriidemk", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_semitic_languages_eval_english_andriidemk| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AndriiDemk/bert-finetuned-sem_eval-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_balluk_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_balluk_pipeline_en.md new file mode 100644 index 00000000000000..581adf9d7a27c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_balluk_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_semitic_languages_eval_english_balluk_pipeline pipeline BertForSequenceClassification from balluk +author: John Snow Labs +name: bert_finetuned_semitic_languages_eval_english_balluk_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_semitic_languages_eval_english_balluk_pipeline` is an English model originally trained by balluk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_balluk_pipeline_en_5.5.0_3.0_1727310889408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_balluk_pipeline_en_5.5.0_3.0_1727310889408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_semitic_languages_eval_english_balluk_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_semitic_languages_eval_english_balluk_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_semitic_languages_eval_english_balluk_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/balluk/bert-finetuned-sem_eval-english + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_yangel88_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_yangel88_en.md new file mode 100644 index 00000000000000..78b84d60acb1ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_semitic_languages_eval_english_yangel88_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_semitic_languages_eval_english_yangel88 BertForSequenceClassification from yangel88 +author: John Snow Labs +name: bert_finetuned_semitic_languages_eval_english_yangel88 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_semitic_languages_eval_english_yangel88` is an English model originally trained by yangel88. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_yangel88_en_5.5.0_3.0_1727318721750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_yangel88_en_5.5.0_3.0_1727318721750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_yangel88","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_yangel88", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_semitic_languages_eval_english_yangel88| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yangel88/bert-finetuned-sem_eval-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_seq_cl_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_seq_cl_pipeline_en.md new file mode 100644 index 00000000000000..248de4cbfbc41f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_seq_cl_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_seq_cl_pipeline pipeline BertForSequenceClassification from Seogmin +author: John Snow Labs +name: bert_finetuned_seq_cl_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_seq_cl_pipeline` is an English model originally trained by Seogmin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_seq_cl_pipeline_en_5.5.0_3.0_1727341782511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_seq_cl_pipeline_en_5.5.0_3.0_1727341782511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_seq_cl_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_seq_cl_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_seq_cl_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Seogmin/bert-finetuned-seq_cl + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_toxic_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_toxic_en.md new file mode 100644 index 00000000000000..ade9672f245b58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_toxic_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_toxic BertForSequenceClassification from zcamz +author: John Snow Labs +name: bert_finetuned_toxic +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_toxic` is an English model originally trained by zcamz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_toxic_en_5.5.0_3.0_1727367071128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_toxic_en_5.5.0_3.0_1727367071128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_toxic","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_toxic", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_toxic| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/zcamz/bert-finetuned-toxic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_toxic_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_toxic_pipeline_en.md new file mode 100644 index 00000000000000..706c31671269c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuned_toxic_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_toxic_pipeline pipeline BertForSequenceClassification from zcamz +author: John Snow Labs +name: bert_finetuned_toxic_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_toxic_pipeline` is an English model originally trained by zcamz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_toxic_pipeline_en_5.5.0_3.0_1727367092148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_toxic_pipeline_en_5.5.0_3.0_1727367092148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_toxic_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_toxic_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_toxic_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/zcamz/bert-finetuned-toxic + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_sentiment_model_3000_samples_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_sentiment_model_3000_samples_en.md new file mode 100644 index 00000000000000..b5aaf7536ea929 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_sentiment_model_3000_samples_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuning_sentiment_model_3000_samples BertForSequenceClassification from skrh +author: John Snow Labs +name: bert_finetuning_sentiment_model_3000_samples +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_sentiment_model_3000_samples` is an English model originally trained by skrh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_sentiment_model_3000_samples_en_5.5.0_3.0_1727338226162.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_sentiment_model_3000_samples_en_5.5.0_3.0_1727338226162.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_sentiment_model_3000_samples","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_sentiment_model_3000_samples", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_sentiment_model_3000_samples| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/skrh/bert_finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_junzai_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_junzai_pipeline_en.md new file mode 100644 index 00000000000000..88f0b6d419fa36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_junzai_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuning_test_junzai_pipeline pipeline BertForSequenceClassification from junzai +author: John Snow Labs +name: bert_finetuning_test_junzai_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_junzai_pipeline` is an English model originally trained by junzai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_junzai_pipeline_en_5.5.0_3.0_1727353331639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_junzai_pipeline_en_5.5.0_3.0_1727353331639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuning_test_junzai_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuning_test_junzai_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_junzai_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/junzai/bert_finetuning_test + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_leslie_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_leslie_en.md new file mode 100644 index 00000000000000..b3187f83b092c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_leslie_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuning_test_leslie BertForSequenceClassification from leslie +author: John Snow Labs +name: bert_finetuning_test_leslie +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_leslie` is an English model originally trained by leslie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_leslie_en_5.5.0_3.0_1727347555267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_leslie_en_5.5.0_3.0_1727347555267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_leslie","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_leslie", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_leslie| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/leslie/bert_finetuning_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_liangxiaoxiao_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_liangxiaoxiao_en.md new file mode 100644 index 00000000000000..814e363151bcef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_finetuning_test_liangxiaoxiao_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuning_test_liangxiaoxiao BertForSequenceClassification from liangxiaoxiao +author: John Snow Labs +name: bert_finetuning_test_liangxiaoxiao +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_liangxiaoxiao` is an English model originally trained by liangxiaoxiao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_liangxiaoxiao_en_5.5.0_3.0_1727315185426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_liangxiaoxiao_en_5.5.0_3.0_1727315185426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_liangxiaoxiao","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_liangxiaoxiao", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_liangxiaoxiao| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/liangxiaoxiao/bert_finetuning_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_cased_snli_model1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_cased_snli_model1_en.md new file mode 100644 index 00000000000000..e9978af81387bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_cased_snli_model1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_cased_snli_model1 BertForSequenceClassification from varun-v-rao +author: John Snow Labs +name: bert_large_cased_snli_model1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_snli_model1` is an English model originally trained by varun-v-rao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_snli_model1_en_5.5.0_3.0_1727353741856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_snli_model1_en_5.5.0_3.0_1727353741856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_cased_snli_model1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_cased_snli_model1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_snli_model1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/varun-v-rao/bert-large-cased-snli-model1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_cased_snli_model1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_cased_snli_model1_pipeline_en.md new file mode 100644 index 00000000000000..0f1d19efc43b2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_cased_snli_model1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_cased_snli_model1_pipeline pipeline BertForSequenceClassification from varun-v-rao +author: John Snow Labs +name: bert_large_cased_snli_model1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_snli_model1_pipeline` is an English model originally trained by varun-v-rao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_snli_model1_pipeline_en_5.5.0_3.0_1727353813930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_snli_model1_pipeline_en_5.5.0_3.0_1727353813930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_cased_snli_model1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_cased_snli_model1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_snli_model1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/varun-v-rao/bert-large-cased-snli-model1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_mnli_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_mnli_en.md new file mode 100644 index 00000000000000..8feb3f31ea3840 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_mnli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_mnli BertForSequenceClassification from Cheng98 +author: John Snow Labs +name: bert_large_mnli +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_mnli` is an English model originally trained by Cheng98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_mnli_en_5.5.0_3.0_1727355569362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_mnli_en_5.5.0_3.0_1727355569362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_mnli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_mnli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_mnli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Cheng98/bert-large-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_mnli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_mnli_pipeline_en.md new file mode 100644 index 00000000000000..230560900faa4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_mnli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_mnli_pipeline pipeline BertForSequenceClassification from Cheng98 +author: John Snow Labs +name: bert_large_mnli_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_mnli_pipeline` is an English model originally trained by Cheng98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_mnli_pipeline_en_5.5.0_3.0_1727355633469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_mnli_pipeline_en_5.5.0_3.0_1727355633469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_mnli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_mnli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_mnli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Cheng98/bert-large-mnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_adult_text_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_adult_text_classifier_en.md new file mode 100644 index 00000000000000..28d2ed4057ba5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_adult_text_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_adult_text_classifier BertForSequenceClassification from lazyghost +author: John Snow Labs +name: bert_large_uncased_adult_text_classifier +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_adult_text_classifier` is an English model originally trained by lazyghost. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_adult_text_classifier_en_5.5.0_3.0_1727361800259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_adult_text_classifier_en_5.5.0_3.0_1727361800259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_adult_text_classifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_adult_text_classifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_adult_text_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/lazyghost/bert-large-uncased-Adult-Text-Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_adult_text_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_adult_text_classifier_pipeline_en.md new file mode 100644 index 00000000000000..ffc44f11ae79ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_adult_text_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_uncased_adult_text_classifier_pipeline pipeline BertForSequenceClassification from lazyghost +author: John Snow Labs +name: bert_large_uncased_adult_text_classifier_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_adult_text_classifier_pipeline` is an English model originally trained by lazyghost. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_adult_text_classifier_pipeline_en_5.5.0_3.0_1727361826393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_adult_text_classifier_pipeline_en_5.5.0_3.0_1727361826393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_uncased_adult_text_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_uncased_adult_text_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_adult_text_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/lazyghost/bert-large-uncased-Adult-Text-Classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_cola_int8_indic_languages_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_cola_int8_indic_languages_en.md new file mode 100644 index 00000000000000..7a835d1a4bf015 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_cola_int8_indic_languages_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_cola_int8_indic_languages BertForSequenceClassification from Intel +author: John Snow Labs +name: bert_large_uncased_cola_int8_indic_languages +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_cola_int8_indic_languages` is an English model originally trained by Intel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_cola_int8_indic_languages_en_5.5.0_3.0_1727337786147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_cola_int8_indic_languages_en_5.5.0_3.0_1727337786147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_cola_int8_indic_languages","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_cola_int8_indic_languages", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_cola_int8_indic_languages| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Intel/bert-large-uncased-cola-int8-inc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_deletion_multiclass_complete_final_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_deletion_multiclass_complete_final_en.md new file mode 100644 index 00000000000000..db7db98c6bedc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_deletion_multiclass_complete_final_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_deletion_multiclass_complete_final BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_large_uncased_deletion_multiclass_complete_final +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_deletion_multiclass_complete_final` is an English model originally trained by research-dump. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_deletion_multiclass_complete_final_en_5.5.0_3.0_1727352317518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_deletion_multiclass_complete_final_en_5.5.0_3.0_1727352317518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_deletion_multiclass_complete_final","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_deletion_multiclass_complete_final", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_deletion_multiclass_complete_final| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/research-dump/bert-large-uncased_deletion_multiclass_complete_final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_finetuned_filtered_0602_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_finetuned_filtered_0602_en.md new file mode 100644 index 00000000000000..95636a0f1ffe42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_finetuned_filtered_0602_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_finetuned_filtered_0602 BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: bert_large_uncased_finetuned_filtered_0602 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_finetuned_filtered_0602` is an English model originally trained by YeRyeongLee.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_filtered_0602_en_5.5.0_3.0_1727349928706.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_filtered_0602_en_5.5.0_3.0_1727349928706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_finetuned_filtered_0602","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_finetuned_filtered_0602", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_finetuned_filtered_0602| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/YeRyeongLee/bert-large-uncased-finetuned-filtered-0602 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_hate_offensive_oriya_normal_speech_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_hate_offensive_oriya_normal_speech_en.md new file mode 100644 index 00000000000000..068c13d0141d3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_hate_offensive_oriya_normal_speech_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_hate_offensive_oriya_normal_speech BertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: bert_large_uncased_hate_offensive_oriya_normal_speech +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_hate_offensive_oriya_normal_speech` is an English model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_hate_offensive_oriya_normal_speech_en_5.5.0_3.0_1727309822312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_hate_offensive_oriya_normal_speech_en_5.5.0_3.0_1727309822312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_hate_offensive_oriya_normal_speech","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_hate_offensive_oriya_normal_speech", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_hate_offensive_oriya_normal_speech| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/DunnBC22/bert-large-uncased-Hate_Offensive_or_Normal_Speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline_en.md new file mode 100644 index 00000000000000..030656e605afc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline pipeline BertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline` is an English model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline_en_5.5.0_3.0_1727309887616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline_en_5.5.0_3.0_1727309887616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_hate_offensive_oriya_normal_speech_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/DunnBC22/bert-large-uncased-Hate_Offensive_or_Normal_Speech + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..49e20410c3c006 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_uncased_sentiment_pipeline pipeline BertForSequenceClassification from rttl-ai +author: John Snow Labs +name: bert_large_uncased_sentiment_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_sentiment_pipeline` is an English model originally trained by rttl-ai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_sentiment_pipeline_en_5.5.0_3.0_1727324233776.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_sentiment_pipeline_en_5.5.0_3.0_1727324233776.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_uncased_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_uncased_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/rttl-ai/bert-large-uncased-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_wikistance_policy_v1_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_wikistance_policy_v1_en.md new file mode 100644 index 00000000000000..f2d4e003162d5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_wikistance_policy_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_large_uncased_wikistance_policy_v1 BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_large_uncased_wikistance_policy_v1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_wikistance_policy_v1` is an English model originally trained by research-dump.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_wikistance_policy_v1_en_5.5.0_3.0_1727347425589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_wikistance_policy_v1_en_5.5.0_3.0_1727347425589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_wikistance_policy_v1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_wikistance_policy_v1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_wikistance_policy_v1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/research-dump/bert-large-uncased_wikistance_policy_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_wikistance_policy_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_wikistance_policy_v1_pipeline_en.md new file mode 100644 index 00000000000000..388c129db5b737 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_large_uncased_wikistance_policy_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_large_uncased_wikistance_policy_v1_pipeline pipeline BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_large_uncased_wikistance_policy_v1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_wikistance_policy_v1_pipeline` is an English model originally trained by research-dump.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_wikistance_policy_v1_pipeline_en_5.5.0_3.0_1727347490311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_wikistance_policy_v1_pipeline_en_5.5.0_3.0_1727347490311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_large_uncased_wikistance_policy_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_large_uncased_wikistance_policy_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_wikistance_policy_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/research-dump/bert-large-uncased_wikistance_policy_v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mdgender_convai_ternary_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mdgender_convai_ternary_pipeline_en.md new file mode 100644 index 00000000000000..53635566003706 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mdgender_convai_ternary_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_mdgender_convai_ternary_pipeline pipeline BertForSequenceClassification from Cameron +author: John Snow Labs +name: bert_mdgender_convai_ternary_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mdgender_convai_ternary_pipeline` is an English model originally trained by Cameron.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mdgender_convai_ternary_pipeline_en_5.5.0_3.0_1727367249618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mdgender_convai_ternary_pipeline_en_5.5.0_3.0_1727367249618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_mdgender_convai_ternary_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_mdgender_convai_ternary_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mdgender_convai_ternary_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Cameron/BERT-mdgender-convai-ternary + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mini_sentiment_reward_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mini_sentiment_reward_model_en.md new file mode 100644 index 00000000000000..7b34c44ebbdbb4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mini_sentiment_reward_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_mini_sentiment_reward_model BertForSequenceClassification from shahrukhx01 +author: John Snow Labs +name: bert_mini_sentiment_reward_model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mini_sentiment_reward_model` is an English model originally trained by shahrukhx01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mini_sentiment_reward_model_en_5.5.0_3.0_1727309376452.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mini_sentiment_reward_model_en_5.5.0_3.0_1727309376452.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_mini_sentiment_reward_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mini_sentiment_reward_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mini_sentiment_reward_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/shahrukhx01/bert-mini-sentiment-reward-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mini_sentiment_reward_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mini_sentiment_reward_model_pipeline_en.md new file mode 100644 index 00000000000000..405d5f1c596882 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mini_sentiment_reward_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_mini_sentiment_reward_model_pipeline pipeline BertForSequenceClassification from shahrukhx01 +author: John Snow Labs +name: bert_mini_sentiment_reward_model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mini_sentiment_reward_model_pipeline` is an English model originally trained by shahrukhx01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mini_sentiment_reward_model_pipeline_en_5.5.0_3.0_1727309378893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mini_sentiment_reward_model_pipeline_en_5.5.0_3.0_1727309378893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_mini_sentiment_reward_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_mini_sentiment_reward_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mini_sentiment_reward_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/shahrukhx01/bert-mini-sentiment-reward-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mlc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mlc_pipeline_en.md new file mode 100644 index 00000000000000..852883a45b69c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mlc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_mlc_pipeline pipeline BertForSequenceClassification from Suyash07 +author: John Snow Labs +name: bert_mlc_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mlc_pipeline` is an English model originally trained by Suyash07. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mlc_pipeline_en_5.5.0_3.0_1727341534840.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mlc_pipeline_en_5.5.0_3.0_1727341534840.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_mlc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_mlc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mlc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|412.6 MB| + +## References + +https://huggingface.co/Suyash07/BERT_MLC + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mnli_8000_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mnli_8000_en.md new file mode 100644 index 00000000000000..e0dd57b2e68049 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mnli_8000_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_mnli_8000 BertForSequenceClassification from Elkelouizajo +author: John Snow Labs +name: bert_mnli_8000 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mnli_8000` is an English model originally trained by Elkelouizajo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mnli_8000_en_5.5.0_3.0_1727354493288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mnli_8000_en_5.5.0_3.0_1727354493288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_mnli_8000","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mnli_8000", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_mnli_8000|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/Elkelouizajo/bert_mnli_8000
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mnli_8000_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mnli_8000_pipeline_en.md
new file mode 100644
index 00000000000000..f3427bd0e750c7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mnli_8000_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_mnli_8000_pipeline pipeline BertForSequenceClassification from Elkelouizajo
+author: John Snow Labs
+name: bert_mnli_8000_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mnli_8000_pipeline` is an English model originally trained by Elkelouizajo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mnli_8000_pipeline_en_5.5.0_3.0_1727354556834.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mnli_8000_pipeline_en_5.5.0_3.0_1727354556834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_mnli_8000_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_mnli_8000_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_mnli_8000_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/Elkelouizajo/bert_mnli_8000
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mrpc_trained_dichitha_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mrpc_trained_dichitha_en.md
new file mode 100644
index 00000000000000..154516d3705693
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mrpc_trained_dichitha_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_mrpc_trained_dichitha BertForSequenceClassification from Dichitha
+author: John Snow Labs
+name: bert_mrpc_trained_dichitha
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mrpc_trained_dichitha` is an English model originally trained by Dichitha.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mrpc_trained_dichitha_en_5.5.0_3.0_1727345698008.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mrpc_trained_dichitha_en_5.5.0_3.0_1727345698008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_mrpc_trained_dichitha","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mrpc_trained_dichitha", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_mrpc_trained_dichitha|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Dichitha/bert_mrpc_trained
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_mrpc_trained_dichitha_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_mrpc_trained_dichitha_pipeline_en.md
new file mode 100644
index 00000000000000..3cabc595ef2863
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_mrpc_trained_dichitha_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_mrpc_trained_dichitha_pipeline pipeline BertForSequenceClassification from Dichitha
+author: John Snow Labs
+name: bert_mrpc_trained_dichitha_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mrpc_trained_dichitha_pipeline` is an English model originally trained by Dichitha.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mrpc_trained_dichitha_pipeline_en_5.5.0_3.0_1727345719128.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mrpc_trained_dichitha_pipeline_en_5.5.0_3.0_1727345719128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_mrpc_trained_dichitha_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_mrpc_trained_dichitha_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_mrpc_trained_dichitha_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Dichitha/bert_mrpc_trained
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_multilingual_sdg_classification_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_multilingual_sdg_classification_pipeline_xx.md
new file mode 100644
index 00000000000000..3faffac25aaf78
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_multilingual_sdg_classification_pipeline_xx.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: Multilingual bert_multilingual_sdg_classification_pipeline pipeline BertForSequenceClassification from albertmartinez
+author: John Snow Labs
+name: bert_multilingual_sdg_classification_pipeline
+date: 2024-09-26
+tags: [xx, open_source, pipeline, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilingual_sdg_classification_pipeline` is a Multilingual model originally trained by albertmartinez.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingual_sdg_classification_pipeline_xx_5.5.0_3.0_1727358056683.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingual_sdg_classification_pipeline_xx_5.5.0_3.0_1727358056683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_multilingual_sdg_classification_pipeline", lang = "xx")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_multilingual_sdg_classification_pipeline", lang = "xx")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_multilingual_sdg_classification_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|xx|
+|Size:|627.8 MB|
+
+## References
+
+https://huggingface.co/albertmartinez/bert-multilingual-sdg-classification
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_multilingual_sdg_classification_xx.md b/docs/_posts/ahmedlone127/2024-09-26-bert_multilingual_sdg_classification_xx.md
new file mode 100644
index 00000000000000..9d4eea689cca0b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_multilingual_sdg_classification_xx.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: Multilingual bert_multilingual_sdg_classification BertForSequenceClassification from albertmartinez
+author: John Snow Labs
+name: bert_multilingual_sdg_classification
+date: 2024-09-26
+tags: [xx, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilingual_sdg_classification` is a Multilingual model originally trained by albertmartinez.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingual_sdg_classification_xx_5.5.0_3.0_1727358023730.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingual_sdg_classification_xx_5.5.0_3.0_1727358023730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_multilingual_sdg_classification","xx") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_multilingual_sdg_classification", "xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_multilingual_sdg_classification|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|627.8 MB|
+
+## References
+
+https://huggingface.co/albertmartinez/bert-multilingual-sdg-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_paper_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_paper_classifier_en.md
new file mode 100644
index 00000000000000..9ffb476e8033a5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_paper_classifier_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_paper_classifier BertForSequenceClassification from oracat
+author: John Snow Labs
+name: bert_paper_classifier
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_paper_classifier` is an English model originally trained by oracat.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_paper_classifier_en_5.5.0_3.0_1727314482044.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_paper_classifier_en_5.5.0_3.0_1727314482044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_paper_classifier","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_paper_classifier", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_paper_classifier|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|410.4 MB|
+
+## References
+
+https://huggingface.co/oracat/bert-paper-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_paper_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_paper_classifier_pipeline_en.md
new file mode 100644
index 00000000000000..c9557e7bab693e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_paper_classifier_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_paper_classifier_pipeline pipeline BertForSequenceClassification from oracat
+author: John Snow Labs
+name: bert_paper_classifier_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_paper_classifier_pipeline` is an English model originally trained by oracat.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_paper_classifier_pipeline_en_5.5.0_3.0_1727314503444.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_paper_classifier_pipeline_en_5.5.0_3.0_1727314503444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_paper_classifier_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_paper_classifier_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_paper_classifier_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|410.4 MB|
+
+## References
+
+https://huggingface.co/oracat/bert-paper-classifier
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_playground_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_playground_en.md
new file mode 100644
index 00000000000000..da628d21352759
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_playground_en.md
@@ -0,0 +1,98 @@
+---
+layout: model
+title: English bert_playground BertForSequenceClassification from antoineross
+author: John Snow Labs
+name: bert_playground
+date: 2024-09-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_playground` is an English model originally trained by antoineross.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_playground_en_5.5.0_3.0_1727321490779.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_playground_en_5.5.0_3.0_1727321490779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_playground","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_playground","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_playground|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+References
+
+https://huggingface.co/antoineross/bert-playground
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_playground_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_playground_pipeline_en.md
new file mode 100644
index 00000000000000..b91c42a74a0d6c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_playground_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English bert_playground_pipeline pipeline BertForSequenceClassification from antoinerossupedu
+author: John Snow Labs
+name: bert_playground_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_playground_pipeline` is an English model originally trained by antoinerossupedu.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_playground_pipeline_en_5.5.0_3.0_1727321511470.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_playground_pipeline_en_5.5.0_3.0_1727321511470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("bert_playground_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("bert_playground_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_playground_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/antoinerossupedu/bert-playground
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_pol_sentiment_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_pol_sentiment_model_en.md
new file mode 100644
index 00000000000000..cfafac0cf1139a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_pol_sentiment_model_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_pol_sentiment_model BertForSequenceClassification from manuelcastiblan
+author: John Snow Labs
+name: bert_pol_sentiment_model
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_pol_sentiment_model` is an English model originally trained by manuelcastiblan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_pol_sentiment_model_en_5.5.0_3.0_1727326481568.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_pol_sentiment_model_en_5.5.0_3.0_1727326481568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_pol_sentiment_model","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_pol_sentiment_model", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_pol_sentiment_model|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|411.4 MB|
+
+## References
+
+https://huggingface.co/manuelcastiblan/BERT-POL-SENTIMENT-MODEL
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_regulatory_text_classification_01_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_regulatory_text_classification_01_en.md
new file mode 100644
index 00000000000000..ac0b5a882dd532
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_regulatory_text_classification_01_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_regulatory_text_classification_01 BertForSequenceClassification from yirifiai1
+author: John Snow Labs
+name: bert_regulatory_text_classification_01
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_regulatory_text_classification_01` is an English model originally trained by yirifiai1.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_regulatory_text_classification_01_en_5.5.0_3.0_1727313063172.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_regulatory_text_classification_01_en_5.5.0_3.0_1727313063172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_regulatory_text_classification_01","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_regulatory_text_classification_01", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_regulatory_text_classification_01|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.5 MB|
+
+## References
+
+https://huggingface.co/yirifiai1/BERT_Regulatory_Text_Classification_01
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_rte_distilled_cka_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_rte_distilled_cka_en.md
new file mode 100644
index 00000000000000..fce830e2f88ab9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-bert_rte_distilled_cka_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English bert_rte_distilled_cka BertForSequenceClassification from Sayan01
+author: John Snow Labs
+name: bert_rte_distilled_cka
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_rte_distilled_cka` is an English model originally trained by Sayan01.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_rte_distilled_cka_en_5.5.0_3.0_1727312025621.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_rte_distilled_cka_en_5.5.0_3.0_1727312025621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_rte_distilled_cka","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_rte_distilled_cka", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_rte_distilled_cka| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|155.1 MB| + +## References + +https://huggingface.co/Sayan01/bert-rte-distilled-cka \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_samoan_gen1_large_defined_summarized_chuvash_0_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_samoan_gen1_large_defined_summarized_chuvash_0_en.md new file mode 100644 index 00000000000000..245b050be8b8e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_samoan_gen1_large_defined_summarized_chuvash_0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_samoan_gen1_large_defined_summarized_chuvash_0 BertForSequenceClassification from wiorz +author: John Snow Labs +name: bert_samoan_gen1_large_defined_summarized_chuvash_0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_samoan_gen1_large_defined_summarized_chuvash_0` is an English model originally trained by wiorz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_samoan_gen1_large_defined_summarized_chuvash_0_en_5.5.0_3.0_1727320805993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_samoan_gen1_large_defined_summarized_chuvash_0_en_5.5.0_3.0_1727320805993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_samoan_gen1_large_defined_summarized_chuvash_0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_samoan_gen1_large_defined_summarized_chuvash_0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_samoan_gen1_large_defined_summarized_chuvash_0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/wiorz/bert_sm_gen1_large_defined_summarized_cv_0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline_en.md new file mode 100644 index 00000000000000..d6cf912118f6a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline pipeline BertForSequenceClassification from wiorz +author: John Snow Labs +name: bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline` is an English model originally trained by wiorz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline_en_5.5.0_3.0_1727320827783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline_en_5.5.0_3.0_1727320827783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_samoan_gen1_large_defined_summarized_chuvash_0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/wiorz/bert_sm_gen1_large_defined_summarized_cv_0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_sentiment_trainer_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_sentiment_trainer_en.md new file mode 100644 index 00000000000000..834e5f7eee58cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_sentiment_trainer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_sentiment_trainer BertForSequenceClassification from Artanis1551 +author: John Snow Labs +name: bert_sentiment_trainer +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sentiment_trainer` is an English model originally trained by Artanis1551. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sentiment_trainer_en_5.5.0_3.0_1727345401384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sentiment_trainer_en_5.5.0_3.0_1727345401384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sentiment_trainer","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sentiment_trainer", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sentiment_trainer| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Artanis1551/bert_sentiment_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_small_phishing_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_small_phishing_en.md new file mode 100644 index 00000000000000..b27dcd09e1916d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_small_phishing_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_small_phishing BertForSequenceClassification from David-Egea +author: John Snow Labs +name: bert_small_phishing +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_phishing` is an English model originally trained by David-Egea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_phishing_en_5.5.0_3.0_1727357436650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_phishing_en_5.5.0_3.0_1727357436650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_phishing","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_phishing", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_phishing| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|108.0 MB| + +## References + +https://huggingface.co/David-Egea/bert-small-phishing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_small_phishing_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_small_phishing_pipeline_en.md new file mode 100644 index 00000000000000..1fc2e8769b9a98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_small_phishing_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_small_phishing_pipeline pipeline BertForSequenceClassification from David-Egea +author: John Snow Labs +name: bert_small_phishing_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_phishing_pipeline` is an English model originally trained by David-Egea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_phishing_pipeline_en_5.5.0_3.0_1727357441982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_phishing_pipeline_en_5.5.0_3.0_1727357441982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_small_phishing_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_small_phishing_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_phishing_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|108.0 MB| + +## References + +https://huggingface.co/David-Egea/bert-small-phishing + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_spam_detection_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_spam_detection_en.md new file mode 100644 index 00000000000000..ca4233702861d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_spam_detection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_spam_detection BertForSequenceClassification from surajkarki +author: John Snow Labs +name: bert_spam_detection +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_spam_detection` is an English model originally trained by surajkarki. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_spam_detection_en_5.5.0_3.0_1727328569122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_spam_detection_en_5.5.0_3.0_1727328569122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_spam_detection","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_spam_detection", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_spam_detection| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/surajkarki/bert_spam_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_spam_detection_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_spam_detection_pipeline_en.md new file mode 100644 index 00000000000000..efec9ebad8ebbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_spam_detection_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_spam_detection_pipeline pipeline BertForSequenceClassification from surajkarki +author: John Snow Labs +name: bert_spam_detection_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_spam_detection_pipeline` is an English model originally trained by surajkarki. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_spam_detection_pipeline_en_5.5.0_3.0_1727328590586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_spam_detection_pipeline_en_5.5.0_3.0_1727328590586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_spam_detection_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_spam_detection_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_spam_detection_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/surajkarki/bert_spam_detection + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding0model_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding0model_en.md new file mode 100644 index 00000000000000..62e76e716d64df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding0model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_sst2_padding0model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst2_padding0model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst2_padding0model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst2_padding0model_en_5.5.0_3.0_1727347333438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst2_padding0model_en_5.5.0_3.0_1727347333438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2_padding0model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2_padding0model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst2_padding0model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst2_padding0model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding60model_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding60model_en.md new file mode 100644 index 00000000000000..5e2bd86bb7f135 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding60model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_sst2_padding60model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst2_padding60model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst2_padding60model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst2_padding60model_en_5.5.0_3.0_1727320526476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst2_padding60model_en_5.5.0_3.0_1727320526476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2_padding60model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2_padding60model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst2_padding60model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst2_padding60model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding60model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding60model_pipeline_en.md new file mode 100644 index 00000000000000..bfb9f7c362fc2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_sst2_padding60model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_sst2_padding60model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst2_padding60model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst2_padding60model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst2_padding60model_pipeline_en_5.5.0_3.0_1727320547351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst2_padding60model_pipeline_en_5.5.0_3.0_1727320547351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_sst2_padding60model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_sst2_padding60model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst2_padding60model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst2_padding60model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding100model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding100model_pipeline_en.md new file mode 100644 index 00000000000000..b0c6eed39e8127 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding100model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_sst5_padding100model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst5_padding100model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst5_padding100model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst5_padding100model_pipeline_en_5.5.0_3.0_1727322581010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst5_padding100model_pipeline_en_5.5.0_3.0_1727322581010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_sst5_padding100model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_sst5_padding100model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst5_padding100model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst5_padding100model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding20model_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding20model_en.md new file mode 100644 index 00000000000000..bf5f0d5cc1c4f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding20model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_sst5_padding20model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst5_padding20model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst5_padding20model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst5_padding20model_en_5.5.0_3.0_1727319336909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst5_padding20model_en_5.5.0_3.0_1727319336909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst5_padding20model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst5_padding20model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
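Each stage in the pipeline above reads columns that an earlier stage must have written: per the Input Labels in the Model Information table, the classifier expects `document` and `token`. The following is a minimal plain-Python sketch of that wiring rule, runnable without Spark. The `check_wiring` helper and the `(name, inputs, output)` tuple layout are hypothetical illustrations, not part of the Spark NLP API.

```python
from typing import Dict, List, Tuple

# Hypothetical stage description: (stage name, input columns, output column).
Stage = Tuple[str, List[str], str]


def check_wiring(stages: List[Stage], source_cols: List[str]) -> Dict[str, List[str]]:
    """Return, per stage, the input columns that neither the source data nor
    an earlier stage provides. An empty dict means the wiring is consistent."""
    available = set(source_cols)
    missing: Dict[str, List[str]] = {}
    for name, inputs, output in stages:
        absent = [col for col in inputs if col not in available]
        if absent:
            missing[name] = absent
        # The stage's output becomes available to every later stage.
        available.add(output)
    return missing


# Column names taken from the model card: text -> document -> token -> class.
stages = [
    ("documentAssembler", ["text"], "document"),
    ("tokenizer", ["document"], "token"),
    ("sequenceClassifier", ["document", "token"], "class"),
]
print(check_wiring(stages, ["text"]))  # prints {}
```

A misspelled input column is reported against the stage that requests it, which is the kind of mismatch that would otherwise only surface when the pipeline is fitted or transformed.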
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst5_padding20model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst5_padding20model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding20model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding20model_pipeline_en.md new file mode 100644 index 00000000000000..560bc092add28c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_sst5_padding20model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_sst5_padding20model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_sst5_padding20model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst5_padding20model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst5_padding20model_pipeline_en_5.5.0_3.0_1727319360082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst5_padding20model_pipeline_en_5.5.0_3.0_1727319360082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_sst5_padding20model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_sst5_padding20model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
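The download links in these cards appear to share one file-name layout: model name, language code, Spark NLP version, Spark version, and a 13-digit number, joined by underscores. The sketch below splits such a name into its parts using only the Python standard library. The layout, the regular expression, and the reading of the trailing number as epoch milliseconds are assumptions inferred from the links and front matter shown here, not a documented naming scheme, and `parse_artifact` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <name>_<lang>_<spark nlp version>_<spark version>_<13 digits>.zip
ARTIFACT_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<sparknlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_artifact(filename: str) -> dict:
    """Split an artifact file name into its assumed components."""
    match = ARTIFACT_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized artifact name: {filename}")
    info = match.groupdict()
    # Assumption: the trailing number is a Unix timestamp in milliseconds.
    info["stamp"] = datetime.fromtimestamp(int(info["millis"]) / 1000, tz=timezone.utc)
    return info


info = parse_artifact("bert_sst5_padding20model_pipeline_en_5.5.0_3.0_1727319360082.zip")
print(info["name"], info["lang"], info["sparknlp"], info["spark"], info["stamp"].date())
```

Under these assumptions the parsed fields line up with the card's front matter (`language: en`, `edition: Spark NLP 5.5.0`, `spark_version: 3.0`, `date: 2024-09-26`).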
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst5_padding20model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_sst5_padding20model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_stsb_distilled_cka_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_stsb_distilled_cka_en.md new file mode 100644 index 00000000000000..490e1a201e1002 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_stsb_distilled_cka_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_stsb_distilled_cka BertForSequenceClassification from Sayan01 +author: John Snow Labs +name: bert_stsb_distilled_cka +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_stsb_distilled_cka` is an English model originally trained by Sayan01. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_stsb_distilled_cka_en_5.5.0_3.0_1727311533450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_stsb_distilled_cka_en_5.5.0_3.0_1727311533450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_stsb_distilled_cka","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_stsb_distilled_cka", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_stsb_distilled_cka| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.2 MB| + +## References + +https://huggingface.co/Sayan01/bert-stsb-distilled-cka \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_stsb_distilled_cka_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_stsb_distilled_cka_pipeline_en.md new file mode 100644 index 00000000000000..89e38478aec8af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_stsb_distilled_cka_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_stsb_distilled_cka_pipeline pipeline BertForSequenceClassification from Sayan01 +author: John Snow Labs +name: bert_stsb_distilled_cka_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_stsb_distilled_cka_pipeline` is an English model originally trained by Sayan01. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_stsb_distilled_cka_pipeline_en_5.5.0_3.0_1727311547044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_stsb_distilled_cka_pipeline_en_5.5.0_3.0_1727311547044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_stsb_distilled_cka_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_stsb_distilled_cka_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_stsb_distilled_cka_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.2 MB| + +## References + +https://huggingface.co/Sayan01/bert-stsb-distilled-cka + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_test_benj3037_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_test_benj3037_pipeline_en.md new file mode 100644 index 00000000000000..29c66b6e4b2447 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_test_benj3037_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_test_benj3037_pipeline pipeline BertForSequenceClassification from benj3037 +author: John Snow Labs +name: bert_test_benj3037_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_test_benj3037_pipeline` is an English model originally trained by benj3037. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_test_benj3037_pipeline_en_5.5.0_3.0_1727317802187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_test_benj3037_pipeline_en_5.5.0_3.0_1727317802187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_test_benj3037_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_test_benj3037_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_test_benj3037_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/benj3037/bert_test + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_book_text_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_book_text_classifier_en.md new file mode 100644 index 00000000000000..217fc8db2750db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_book_text_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_tiny_book_text_classifier BertForSequenceClassification from shhossain +author: John Snow Labs +name: bert_tiny_book_text_classifier +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_book_text_classifier` is an English model originally trained by shhossain. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_book_text_classifier_en_5.5.0_3.0_1727344121611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_book_text_classifier_en_5.5.0_3.0_1727344121611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_book_text_classifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_book_text_classifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_book_text_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/shhossain/bert-tiny-book-text-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_book_text_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_book_text_classifier_pipeline_en.md new file mode 100644 index 00000000000000..49f46be336d015 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_book_text_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_tiny_book_text_classifier_pipeline pipeline BertForSequenceClassification from shhossain +author: John Snow Labs +name: bert_tiny_book_text_classifier_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_book_text_classifier_pipeline` is an English model originally trained by shhossain. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_book_text_classifier_pipeline_en_5.5.0_3.0_1727344122786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_book_text_classifier_pipeline_en_5.5.0_3.0_1727344122786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_tiny_book_text_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_tiny_book_text_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_book_text_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/shhossain/bert-tiny-book-text-classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_cognitive_bias_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_cognitive_bias_en.md new file mode 100644 index 00000000000000..12be14fb17552d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_cognitive_bias_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_tiny_cognitive_bias BertForSequenceClassification from amedvedev +author: John Snow Labs +name: bert_tiny_cognitive_bias +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_cognitive_bias` is an English model originally trained by amedvedev. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_cognitive_bias_en_5.5.0_3.0_1727337395775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_cognitive_bias_en_5.5.0_3.0_1727337395775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_cognitive_bias","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_cognitive_bias", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_cognitive_bias| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/amedvedev/bert-tiny-cognitive-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_massive_intent_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_massive_intent_en.md new file mode 100644 index 00000000000000..f0af261bb9f10e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_massive_intent_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_tiny_massive_intent BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_tiny_massive_intent +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_massive_intent` is an English model originally trained by gokuls. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_massive_intent_en_5.5.0_3.0_1727347965605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_massive_intent_en_5.5.0_3.0_1727347965605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_massive_intent","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_massive_intent", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_massive_intent| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/gokuls/BERT-tiny-Massive-intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_massive_intent_kd_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_massive_intent_kd_bert_pipeline_en.md new file mode 100644 index 00000000000000..09860fc195b5e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_massive_intent_kd_bert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_tiny_massive_intent_kd_bert_pipeline pipeline BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_tiny_massive_intent_kd_bert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_massive_intent_kd_bert_pipeline` is an English model originally trained by gokuls. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_massive_intent_kd_bert_pipeline_en_5.5.0_3.0_1727359972119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_massive_intent_kd_bert_pipeline_en_5.5.0_3.0_1727359972119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_tiny_massive_intent_kd_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_tiny_massive_intent_kd_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_massive_intent_kd_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/gokuls/bert-tiny-Massive-intent-KD-BERT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_and_distilbert_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_and_distilbert_en.md new file mode 100644 index 00000000000000..68d01e750b5353 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_and_distilbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_tiny_sst2_kd_bert_and_distilbert BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_tiny_sst2_kd_bert_and_distilbert +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_sst2_kd_bert_and_distilbert` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_kd_bert_and_distilbert_en_5.5.0_3.0_1727312469717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_kd_bert_and_distilbert_en_5.5.0_3.0_1727312469717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_sst2_kd_bert_and_distilbert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_sst2_kd_bert_and_distilbert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_sst2_kd_bert_and_distilbert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/gokuls/bert-tiny-sst2-KD-BERT_and_distilBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_and_distilbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_and_distilbert_pipeline_en.md new file mode 100644 index 00000000000000..88141a7f025b8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_and_distilbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_tiny_sst2_kd_bert_and_distilbert_pipeline pipeline BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_tiny_sst2_kd_bert_and_distilbert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_sst2_kd_bert_and_distilbert_pipeline` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_kd_bert_and_distilbert_pipeline_en_5.5.0_3.0_1727312471037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_kd_bert_and_distilbert_pipeline_en_5.5.0_3.0_1727312471037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_tiny_sst2_kd_bert_and_distilbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_tiny_sst2_kd_bert_and_distilbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_sst2_kd_bert_and_distilbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/gokuls/bert-tiny-sst2-KD-BERT_and_distilBERT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_en.md new file mode 100644 index 00000000000000..65a6fe8fb8ae18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_tiny_sst2_kd_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_tiny_sst2_kd_bert BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_tiny_sst2_kd_bert +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_sst2_kd_bert` is an English model originally trained by gokuls. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_kd_bert_en_5.5.0_3.0_1727320985958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_kd_bert_en_5.5.0_3.0_1727320985958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_sst2_kd_bert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_sst2_kd_bert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_sst2_kd_bert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/gokuls/bert-tiny-sst2-KD-BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_turkish_turkish_hotel_reviews_tr.md b/docs/_posts/ahmedlone127/2024-09-26-bert_turkish_turkish_hotel_reviews_tr.md new file mode 100644 index 00000000000000..83168209d76aba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_turkish_turkish_hotel_reviews_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish bert_turkish_turkish_hotel_reviews BertForSequenceClassification from anilguven +author: John Snow Labs +name: bert_turkish_turkish_hotel_reviews +date: 2024-09-26 +tags: [tr, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: tr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_turkish_turkish_hotel_reviews` is a Turkish model originally trained by anilguven. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_turkish_turkish_hotel_reviews_tr_5.5.0_3.0_1727315793841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_turkish_turkish_hotel_reviews_tr_5.5.0_3.0_1727315793841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_turkish_turkish_hotel_reviews","tr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_turkish_turkish_hotel_reviews", "tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_turkish_turkish_hotel_reviews| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.5 MB| + +## References + +https://huggingface.co/anilguven/bert_tr_turkish_hotel_reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_twitter_portuguese_icelandic_unemployed_pt.md b/docs/_posts/ahmedlone127/2024-09-26-bert_twitter_portuguese_icelandic_unemployed_pt.md new file mode 100644 index 00000000000000..7fc444a31bd05f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_twitter_portuguese_icelandic_unemployed_pt.md @@ -0,0 +1,98 @@ +--- +layout: model +title: Portuguese bert_twitter_portuguese_icelandic_unemployed BertForSequenceClassification from manueltonneau +author: John Snow Labs +name: bert_twitter_portuguese_icelandic_unemployed +date: 2024-09-26 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_twitter_portuguese_icelandic_unemployed` is a Portuguese model originally trained by manueltonneau. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitter_portuguese_icelandic_unemployed_pt_5.5.0_3.0_1727340575994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitter_portuguese_icelandic_unemployed_pt_5.5.0_3.0_1727340575994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitter_portuguese_icelandic_unemployed","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitter_portuguese_icelandic_unemployed","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_twitter_portuguese_icelandic_unemployed| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +References + +https://huggingface.co/manueltonneau/bert-twitter-pt-is-unemployed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_twitter_portuguese_job_offer_pt.md b/docs/_posts/ahmedlone127/2024-09-26-bert_twitter_portuguese_job_offer_pt.md new file mode 100644 index 00000000000000..8b7bec408c1485 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_twitter_portuguese_job_offer_pt.md @@ -0,0 +1,98 @@ +--- +layout: model +title: Portuguese bert_twitter_portuguese_job_offer BertForSequenceClassification from manueltonneau +author: John Snow Labs +name: bert_twitter_portuguese_job_offer +date: 2024-09-26 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_twitter_portuguese_job_offer` is a Portuguese model originally trained by manueltonneau. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitter_portuguese_job_offer_pt_5.5.0_3.0_1727369610609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitter_portuguese_job_offer_pt_5.5.0_3.0_1727369610609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitter_portuguese_job_offer","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitter_portuguese_job_offer","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_twitter_portuguese_job_offer| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +References + +https://huggingface.co/manueltonneau/bert-twitter-pt-job-offer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding70model_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding70model_en.md new file mode 100644 index 00000000000000..e103a1b2c6dd15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding70model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_twitterfin_padding70model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_twitterfin_padding70model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_twitterfin_padding70model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding70model_en_5.5.0_3.0_1727351959084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding70model_en_5.5.0_3.0_1727351959084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitterfin_padding70model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitterfin_padding70model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_twitterfin_padding70model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_twitterfin_padding70model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding70model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding70model_pipeline_en.md new file mode 100644 index 00000000000000..6f972054ac8921 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding70model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_twitterfin_padding70model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_twitterfin_padding70model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_twitterfin_padding70model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding70model_pipeline_en_5.5.0_3.0_1727351983985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding70model_pipeline_en_5.5.0_3.0_1727351983985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_twitterfin_padding70model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_twitterfin_padding70model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_twitterfin_padding70model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_twitterfin_padding70model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding80model_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding80model_en.md new file mode 100644 index 00000000000000..bb3e8b8675d63f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding80model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_twitterfin_padding80model BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_twitterfin_padding80model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_twitterfin_padding80model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding80model_en_5.5.0_3.0_1727318242906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding80model_en_5.5.0_3.0_1727318242906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitterfin_padding80model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_twitterfin_padding80model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_twitterfin_padding80model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_twitterfin_padding80model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding80model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding80model_pipeline_en.md new file mode 100644 index 00000000000000..52f91bb26acb40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_twitterfin_padding80model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_twitterfin_padding80model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: bert_twitterfin_padding80model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_twitterfin_padding80model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding80model_pipeline_en_5.5.0_3.0_1727318267254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_twitterfin_padding80model_pipeline_en_5.5.0_3.0_1727318267254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_twitterfin_padding80model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_twitterfin_padding80model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_twitterfin_padding80model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/bert_twitterfin_padding80model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_uncased_l_8_h_256_a_4_emotion_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_uncased_l_8_h_256_a_4_emotion_en.md new file mode 100644 index 00000000000000..d88b9d93c1017c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_uncased_l_8_h_256_a_4_emotion_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_uncased_l_8_h_256_a_4_emotion BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_uncased_l_8_h_256_a_4_emotion +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_uncased_l_8_h_256_a_4_emotion` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_l_8_h_256_a_4_emotion_en_5.5.0_3.0_1727314367699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_l_8_h_256_a_4_emotion_en_5.5.0_3.0_1727314367699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_8_h_256_a_4_emotion","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_8_h_256_a_4_emotion", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_uncased_l_8_h_256_a_4_emotion| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|53.9 MB| + +## References + +https://huggingface.co/gokuls/bert_uncased_L-8_H-256_A-4_emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_uncased_l_8_h_256_a_4_emotion_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_uncased_l_8_h_256_a_4_emotion_pipeline_en.md new file mode 100644 index 00000000000000..997224ed1614db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_uncased_l_8_h_256_a_4_emotion_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_uncased_l_8_h_256_a_4_emotion_pipeline pipeline BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_uncased_l_8_h_256_a_4_emotion_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_uncased_l_8_h_256_a_4_emotion_pipeline` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_l_8_h_256_a_4_emotion_pipeline_en_5.5.0_3.0_1727314370521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_l_8_h_256_a_4_emotion_pipeline_en_5.5.0_3.0_1727314370521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_uncased_l_8_h_256_a_4_emotion_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_uncased_l_8_h_256_a_4_emotion_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_uncased_l_8_h_256_a_4_emotion_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|54.0 MB| + +## References + +https://huggingface.co/gokuls/bert_uncased_L-8_H-256_A-4_emotion + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_with_preprocessing_grid_search_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_with_preprocessing_grid_search_en.md new file mode 100644 index 00000000000000..60a374dcfbf8fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_with_preprocessing_grid_search_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_with_preprocessing_grid_search BertForSequenceClassification from LovenOO +author: John Snow Labs +name: bert_with_preprocessing_grid_search +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_with_preprocessing_grid_search` is an English model originally trained by LovenOO. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_with_preprocessing_grid_search_en_5.5.0_3.0_1727340349423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_with_preprocessing_grid_search_en_5.5.0_3.0_1727340349423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_with_preprocessing_grid_search","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_with_preprocessing_grid_search", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_with_preprocessing_grid_search| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/LovenOO/BERT_with_preprocessing_grid_search \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_with_preprocessing_grid_search_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_with_preprocessing_grid_search_pipeline_en.md new file mode 100644 index 00000000000000..837c85538cecaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_with_preprocessing_grid_search_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_with_preprocessing_grid_search_pipeline pipeline BertForSequenceClassification from LovenOO +author: John Snow Labs +name: bert_with_preprocessing_grid_search_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_with_preprocessing_grid_search_pipeline` is an English model originally trained by LovenOO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_with_preprocessing_grid_search_pipeline_en_5.5.0_3.0_1727340370175.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_with_preprocessing_grid_search_pipeline_en_5.5.0_3.0_1727340370175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_with_preprocessing_grid_search_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_with_preprocessing_grid_search_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
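The pretrained-pipeline snippets use a DataFrame `df` that is never defined. A minimal sketch of one way to build it follows, assuming the packaged pipeline reads a column named `text` (as the `DocumentAssembler` in the explicit examples does) and that a Spark session is already available. The helpers `rows_for` and `classify` are hypothetical names, not Spark NLP API.

```python
def rows_for(texts):
    """Wrap each string in a one-element row, the shape createDataFrame expects."""
    return [[t] for t in texts]


def classify(spark, texts,
             pipeline_name="bert_with_preprocessing_grid_search_pipeline",
             lang="en"):
    """Run a pretrained pipeline over plain strings (hypothetical helper)."""
    # Imported lazily so rows_for can be used without Spark NLP installed.
    from sparknlp.pretrained import PretrainedPipeline

    # Assumption: the pipeline's DocumentAssembler expects an input column "text".
    df = spark.createDataFrame(rows_for(texts)).toDF("text")
    pipeline = PretrainedPipeline(pipeline_name, lang=lang)
    return pipeline.transform(df)
```

With a session from `sparknlp.start()`, `classify(spark, ["I love spark-nlp"])` returns the annotated DataFrame; the name of the prediction column is set inside the packaged pipeline and should be checked with `printSchema()` rather than assumed.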
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_with_preprocessing_grid_search_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/LovenOO/BERT_with_preprocessing_grid_search + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bert_without_preprocessing_grid_search_en.md b/docs/_posts/ahmedlone127/2024-09-26-bert_without_preprocessing_grid_search_en.md new file mode 100644 index 00000000000000..dfc5076fa3dd57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bert_without_preprocessing_grid_search_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_without_preprocessing_grid_search BertForSequenceClassification from LovenOO +author: John Snow Labs +name: bert_without_preprocessing_grid_search +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_without_preprocessing_grid_search` is an English model originally trained by LovenOO. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_without_preprocessing_grid_search_en_5.5.0_3.0_1727346962603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_without_preprocessing_grid_search_en_5.5.0_3.0_1727346962603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_without_preprocessing_grid_search","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_without_preprocessing_grid_search", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_without_preprocessing_grid_search| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/LovenOO/BERT_without_preprocessing_grid_search \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bertsmallclassifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bertsmallclassifier_pipeline_en.md new file mode 100644 index 00000000000000..1a0f7c8201dfe9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bertsmallclassifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bertsmallclassifier_pipeline pipeline BertForSequenceClassification from meetplace1 +author: John Snow Labs +name: bertsmallclassifier_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertsmallclassifier_pipeline` is an English model originally trained by meetplace1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertsmallclassifier_pipeline_en_5.5.0_3.0_1727368693835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertsmallclassifier_pipeline_en_5.5.0_3.0_1727368693835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bertsmallclassifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bertsmallclassifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertsmallclassifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/meetplace1/bertsmallclassifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-berttiny_hate_speech18_bothpretrained_en.md b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hate_speech18_bothpretrained_en.md new file mode 100644 index 00000000000000..e542a710233473 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hate_speech18_bothpretrained_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English berttiny_hate_speech18_bothpretrained BertForSequenceClassification from joseph10 +author: John Snow Labs +name: berttiny_hate_speech18_bothpretrained +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berttiny_hate_speech18_bothpretrained` is an English model originally trained by joseph10. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berttiny_hate_speech18_bothpretrained_en_5.5.0_3.0_1727328238299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berttiny_hate_speech18_bothpretrained_en_5.5.0_3.0_1727328238299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("berttiny_hate_speech18_bothpretrained","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("berttiny_hate_speech18_bothpretrained", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berttiny_hate_speech18_bothpretrained| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/joseph10/berttiny-hate_speech18-bothpretrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-berttiny_hate_speech18_bothpretrained_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hate_speech18_bothpretrained_pipeline_en.md new file mode 100644 index 00000000000000..f578d5bf22e83c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hate_speech18_bothpretrained_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English berttiny_hate_speech18_bothpretrained_pipeline pipeline BertForSequenceClassification from joseph10 +author: John Snow Labs +name: berttiny_hate_speech18_bothpretrained_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berttiny_hate_speech18_bothpretrained_pipeline` is an English model originally trained by joseph10. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berttiny_hate_speech18_bothpretrained_pipeline_en_5.5.0_3.0_1727328239446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berttiny_hate_speech18_bothpretrained_pipeline_en_5.5.0_3.0_1727328239446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("berttiny_hate_speech18_bothpretrained_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("berttiny_hate_speech18_bothpretrained_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berttiny_hate_speech18_bothpretrained_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/joseph10/berttiny-hate_speech18-bothpretrained + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-berttiny_hatexplain_parentpretrained_en.md b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hatexplain_parentpretrained_en.md new file mode 100644 index 00000000000000..62b9df5f614258 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hatexplain_parentpretrained_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English berttiny_hatexplain_parentpretrained BertForSequenceClassification from joseph10 +author: John Snow Labs +name: berttiny_hatexplain_parentpretrained +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berttiny_hatexplain_parentpretrained` is an English model originally trained by joseph10. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berttiny_hatexplain_parentpretrained_en_5.5.0_3.0_1727317769426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berttiny_hatexplain_parentpretrained_en_5.5.0_3.0_1727317769426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("berttiny_hatexplain_parentpretrained","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("berttiny_hatexplain_parentpretrained", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berttiny_hatexplain_parentpretrained| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/joseph10/berttiny-hateXplain-parentpretrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-berttiny_hatexplain_parentpretrained_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hatexplain_parentpretrained_pipeline_en.md new file mode 100644 index 00000000000000..6c9841272239d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-berttiny_hatexplain_parentpretrained_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English berttiny_hatexplain_parentpretrained_pipeline pipeline BertForSequenceClassification from joseph10 +author: John Snow Labs +name: berttiny_hatexplain_parentpretrained_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berttiny_hatexplain_parentpretrained_pipeline` is an English model originally trained by joseph10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berttiny_hatexplain_parentpretrained_pipeline_en_5.5.0_3.0_1727317772918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berttiny_hatexplain_parentpretrained_pipeline_en_5.5.0_3.0_1727317772918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("berttiny_hatexplain_parentpretrained_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("berttiny_hatexplain_parentpretrained_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berttiny_hatexplain_parentpretrained_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/joseph10/berttiny-hateXplain-parentpretrained + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-best_model_sst_2_32_87_en.md b/docs/_posts/ahmedlone127/2024-09-26-best_model_sst_2_32_87_en.md new file mode 100644 index 00000000000000..36aee926016dbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-best_model_sst_2_32_87_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English best_model_sst_2_32_87 BertForSequenceClassification from simonycl +author: John Snow Labs +name: best_model_sst_2_32_87 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `best_model_sst_2_32_87` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/best_model_sst_2_32_87_en_5.5.0_3.0_1727315682002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/best_model_sst_2_32_87_en_5.5.0_3.0_1727315682002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("best_model_sst_2_32_87","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("best_model_sst_2_32_87", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|best_model_sst_2_32_87| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/simonycl/best_model-sst-2-32-87 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-beto_emoji_es.md b/docs/_posts/ahmedlone127/2024-09-26-beto_emoji_es.md new file mode 100644 index 00000000000000..1ecd595d9548a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-beto_emoji_es.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Castilian, Spanish beto_emoji BertForSequenceClassification from ccarvajal +author: John Snow Labs +name: beto_emoji +date: 2024-09-26 +tags: [es, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: es +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_emoji` is a Castilian, Spanish model originally trained by ccarvajal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_emoji_es_5.5.0_3.0_1727345530161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_emoji_es_5.5.0_3.0_1727345530161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("beto_emoji","es") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("beto_emoji", "es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_emoji| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.9 MB| + +## References + +https://huggingface.co/ccarvajal/beto-emoji \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bge_large_chinese_llama3_70_en.md b/docs/_posts/ahmedlone127/2024-09-26-bge_large_chinese_llama3_70_en.md new file mode 100644 index 00000000000000..3cf92d7d434fbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bge_large_chinese_llama3_70_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bge_large_chinese_llama3_70 BertForSequenceClassification from Snowkcon +author: John Snow Labs +name: bge_large_chinese_llama3_70 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bge_large_chinese_llama3_70` is an English model originally trained by Snowkcon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_large_chinese_llama3_70_en_5.5.0_3.0_1727315099419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_large_chinese_llama3_70_en_5.5.0_3.0_1727315099419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bge_large_chinese_llama3_70","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bge_large_chinese_llama3_70", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bge_large_chinese_llama3_70| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|768.4 MB| + +## References + +https://huggingface.co/Snowkcon/bge_large_zh_llama3_70 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bge_large_chinese_llama3_70_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-bge_large_chinese_llama3_70_pipeline_en.md new file mode 100644 index 00000000000000..efbdae2e68a6c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bge_large_chinese_llama3_70_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bge_large_chinese_llama3_70_pipeline pipeline BertForSequenceClassification from Snowkcon +author: John Snow Labs +name: bge_large_chinese_llama3_70_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bge_large_chinese_llama3_70_pipeline` is an English model originally trained by Snowkcon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_large_chinese_llama3_70_pipeline_en_5.5.0_3.0_1727315331932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_large_chinese_llama3_70_pipeline_en_5.5.0_3.0_1727315331932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bge_large_chinese_llama3_70_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bge_large_chinese_llama3_70_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bge_large_chinese_llama3_70_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|768.4 MB| + +## References + +https://huggingface.co/Snowkcon/bge_large_zh_llama3_70 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_en.md b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_en.md new file mode 100644 index 00000000000000..b47c933a103eb5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4 BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4` is an English model originally trained by ys7yoo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_en_5.5.0_3.0_1727318060003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_en_5.5.0_3.0_1727318060003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/ys7yoo/binary-inference_bert-base_lr1e-03_wd1e-03_bs32_ep10_plant_fold4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline_en.md new file mode 100644 index 00000000000000..a4f61f4815b423 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline pipeline BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline` is an English pipeline built around a model originally trained by ys7yoo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline_en_5.5.0_3.0_1727318081320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline_en_5.5.0_3.0_1727318081320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_inference_bert_base_lr1e_03_wd1e_03_bs32_ep10_plant_fold4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/ys7yoo/binary-inference_bert-base_lr1e-03_wd1e-03_bs32_ep10_plant_fold4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_en.md b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_en.md new file mode 100644 index 00000000000000..75de5175856961 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2 BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2` is an English model originally trained by ys7yoo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_en_5.5.0_3.0_1727320509423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_en_5.5.0_3.0_1727320509423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/ys7yoo/binary-inference_bert-base_lr5e-06_wd1e-03_bs16_ep10_plant_fold2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline_en.md new file mode 100644 index 00000000000000..e6424125a903a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline pipeline BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline` is an English pipeline built around a model originally trained by ys7yoo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline_en_5.5.0_3.0_1727320530687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline_en_5.5.0_3.0_1727320530687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_inference_bert_base_lr5e_06_wd1e_03_bs16_ep10_plant_fold2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/ys7yoo/binary-inference_bert-base_lr5e-06_wd1e-03_bs16_ep10_plant_fold2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-bio_clinicalbert_finetuned_20pc_en.md b/docs/_posts/ahmedlone127/2024-09-26-bio_clinicalbert_finetuned_20pc_en.md new file mode 100644 index 00000000000000..4222d982e4c4e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-bio_clinicalbert_finetuned_20pc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bio_clinicalbert_finetuned_20pc BertForSequenceClassification from okho0653 +author: John Snow Labs +name: bio_clinicalbert_finetuned_20pc +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bio_clinicalbert_finetuned_20pc` is an English model originally trained by okho0653.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_finetuned_20pc_en_5.5.0_3.0_1727369373649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_finetuned_20pc_en_5.5.0_3.0_1727369373649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("bio_clinicalbert_finetuned_20pc","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bio_clinicalbert_finetuned_20pc", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bio_clinicalbert_finetuned_20pc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/okho0653/Bio_ClinicalBERT-finetuned-20pc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-biobert_medical_abstract_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-biobert_medical_abstract_classification_en.md new file mode 100644 index 00000000000000..116c1d4c3595fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-biobert_medical_abstract_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English biobert_medical_abstract_classification BertForSequenceClassification from HarshadKunjir +author: John Snow Labs +name: biobert_medical_abstract_classification +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_medical_abstract_classification` is an English model originally trained by HarshadKunjir. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_medical_abstract_classification_en_5.5.0_3.0_1727343674858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_medical_abstract_classification_en_5.5.0_3.0_1727343674858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("biobert_medical_abstract_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biobert_medical_abstract_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_medical_abstract_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/HarshadKunjir/BioBERT_medical_abstract_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_3000_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_3000_bert_base_uncased_en.md new file mode 100644 index 00000000000000..c371c91da83aa4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_3000_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English boss_sentiment_3000_bert_base_uncased BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_sentiment_3000_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_sentiment_3000_bert_base_uncased` is an English model originally trained by Kyle1668. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_sentiment_3000_bert_base_uncased_en_5.5.0_3.0_1727309495849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_sentiment_3000_bert_base_uncased_en_5.5.0_3.0_1727309495849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("boss_sentiment_3000_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("boss_sentiment_3000_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_sentiment_3000_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-sentiment-3000-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_3000_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_3000_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..04d4fd76f94c26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_3000_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English boss_sentiment_3000_bert_base_uncased_pipeline pipeline BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_sentiment_3000_bert_base_uncased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_sentiment_3000_bert_base_uncased_pipeline` is an English pipeline built around a model originally trained by Kyle1668.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_sentiment_3000_bert_base_uncased_pipeline_en_5.5.0_3.0_1727309517716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_sentiment_3000_bert_base_uncased_pipeline_en_5.5.0_3.0_1727309517716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("boss_sentiment_3000_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("boss_sentiment_3000_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_sentiment_3000_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-sentiment-3000-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_6000_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_6000_bert_base_uncased_en.md new file mode 100644 index 00000000000000..b865566cae0025 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-boss_sentiment_6000_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English boss_sentiment_6000_bert_base_uncased BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_sentiment_6000_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_sentiment_6000_bert_base_uncased` is an English model originally trained by Kyle1668.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_sentiment_6000_bert_base_uncased_en_5.5.0_3.0_1727361064063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_sentiment_6000_bert_base_uncased_en_5.5.0_3.0_1727361064063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("boss_sentiment_6000_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("boss_sentiment_6000_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_sentiment_6000_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-sentiment-6000-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-boss_toxicity_12000_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-boss_toxicity_12000_bert_base_uncased_en.md new file mode 100644 index 00000000000000..4c63e4687c8049 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-boss_toxicity_12000_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English boss_toxicity_12000_bert_base_uncased BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_toxicity_12000_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_toxicity_12000_bert_base_uncased` is an English model originally trained by Kyle1668. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_toxicity_12000_bert_base_uncased_en_5.5.0_3.0_1727339165706.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_toxicity_12000_bert_base_uncased_en_5.5.0_3.0_1727339165706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("boss_toxicity_12000_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("boss_toxicity_12000_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_toxicity_12000_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-toxicity-12000-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-boss_toxicity_6000_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-boss_toxicity_6000_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..4cd307a0a56e7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-boss_toxicity_6000_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English boss_toxicity_6000_bert_base_uncased_pipeline pipeline BertForSequenceClassification from Kyle1668 +author: John Snow Labs +name: boss_toxicity_6000_bert_base_uncased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `boss_toxicity_6000_bert_base_uncased_pipeline` is an English pipeline built around a model originally trained by Kyle1668.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/boss_toxicity_6000_bert_base_uncased_pipeline_en_5.5.0_3.0_1727370823255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/boss_toxicity_6000_bert_base_uncased_pipeline_en_5.5.0_3.0_1727370823255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("boss_toxicity_6000_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("boss_toxicity_6000_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|boss_toxicity_6000_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kyle1668/boss-toxicity-6000-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-check_sec_tiny_en.md b/docs/_posts/ahmedlone127/2024-09-26-check_sec_tiny_en.md new file mode 100644 index 00000000000000..017492d7df8f09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-check_sec_tiny_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English check_sec_tiny BertForSequenceClassification from huolongguo10 +author: John Snow Labs +name: check_sec_tiny +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `check_sec_tiny` is an English model originally trained by huolongguo10. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/check_sec_tiny_en_5.5.0_3.0_1727359205906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/check_sec_tiny_en_5.5.0_3.0_1727359205906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("check_sec_tiny","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("check_sec_tiny", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|check_sec_tiny| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/huolongguo10/check_sec_tiny \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-check_sec_tiny_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-check_sec_tiny_pipeline_en.md new file mode 100644 index 00000000000000..425daa1389e4a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-check_sec_tiny_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English check_sec_tiny_pipeline pipeline BertForSequenceClassification from huolongguo10 +author: John Snow Labs +name: check_sec_tiny_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `check_sec_tiny_pipeline` is an English model originally trained by huolongguo10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/check_sec_tiny_pipeline_en_5.5.0_3.0_1727359207193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/check_sec_tiny_pipeline_en_5.5.0_3.0_1727359207193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("check_sec_tiny_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("check_sec_tiny_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|check_sec_tiny_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/huolongguo10/check_sec_tiny + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-chinese_roberta_climate_risk_opportunity_prediction_vv2_en.md b/docs/_posts/ahmedlone127/2024-09-26-chinese_roberta_climate_risk_opportunity_prediction_vv2_en.md new file mode 100644 index 00000000000000..fc556bcf62c126 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-chinese_roberta_climate_risk_opportunity_prediction_vv2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English chinese_roberta_climate_risk_opportunity_prediction_vv2 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: chinese_roberta_climate_risk_opportunity_prediction_vv2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_roberta_climate_risk_opportunity_prediction_vv2` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_climate_risk_opportunity_prediction_vv2_en_5.5.0_3.0_1727309273903.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_climate_risk_opportunity_prediction_vv2_en_5.5.0_3.0_1727309273903.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_climate_risk_opportunity_prediction_vv2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_climate_risk_opportunity_prediction_vv2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_climate_risk_opportunity_prediction_vv2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.2 MB| + +## References + +https://huggingface.co/hw2942/chinese-roberta-climate-risk-opportunity-prediction-vv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-chinese_roberta_wwm_ext_chnsenticorp_en.md b/docs/_posts/ahmedlone127/2024-09-26-chinese_roberta_wwm_ext_chnsenticorp_en.md new file mode 100644 index 00000000000000..f88289965e0e9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-chinese_roberta_wwm_ext_chnsenticorp_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English chinese_roberta_wwm_ext_chnsenticorp BertForSequenceClassification from linfuyou +author: John Snow Labs +name: chinese_roberta_wwm_ext_chnsenticorp +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_roberta_wwm_ext_chnsenticorp` is an English model originally trained by linfuyou. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_chnsenticorp_en_5.5.0_3.0_1727366282245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_chnsenticorp_en_5.5.0_3.0_1727366282245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_chnsenticorp","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_chnsenticorp", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_wwm_ext_chnsenticorp| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.2 MB| + +## References + +https://huggingface.co/linfuyou/chinese-roberta-wwm-ext-chnsenticorp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-citydistilledmodel_en.md b/docs/_posts/ahmedlone127/2024-09-26-citydistilledmodel_en.md new file mode 100644 index 00000000000000..c328fc3e57c912 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-citydistilledmodel_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English citydistilledmodel BertForSequenceClassification from privacy-tech-lab +author: John Snow Labs +name: citydistilledmodel +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `citydistilledmodel` is an English model originally trained by privacy-tech-lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/citydistilledmodel_en_5.5.0_3.0_1727339025355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/citydistilledmodel_en_5.5.0_3.0_1727339025355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("citydistilledmodel","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("citydistilledmodel", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|citydistilledmodel| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/privacy-tech-lab/CityDistilledModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-citymodel_en.md b/docs/_posts/ahmedlone127/2024-09-26-citymodel_en.md new file mode 100644 index 00000000000000..98a87f1ccb4678 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-citymodel_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English citymodel BertForSequenceClassification from privacy-tech-lab +author: John Snow Labs +name: citymodel +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `citymodel` is an English model originally trained by privacy-tech-lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/citymodel_en_5.5.0_3.0_1727310280228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/citymodel_en_5.5.0_3.0_1727310280228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("citymodel","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("citymodel", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|citymodel| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/privacy-tech-lab/CityModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-citymodel_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-citymodel_pipeline_en.md new file mode 100644 index 00000000000000..cc75909a45608a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-citymodel_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English citymodel_pipeline pipeline BertForSequenceClassification from privacy-tech-lab +author: John Snow Labs +name: citymodel_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `citymodel_pipeline` is an English model originally trained by privacy-tech-lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/citymodel_pipeline_en_5.5.0_3.0_1727310283308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/citymodel_pipeline_en_5.5.0_3.0_1727310283308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("citymodel_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("citymodel_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|citymodel_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/privacy-tech-lab/CityModel + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ciuo08cl_4d_2024_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-ciuo08cl_4d_2024_pipeline_en.md new file mode 100644 index 00000000000000..1e13fc1cb1b97b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ciuo08cl_4d_2024_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ciuo08cl_4d_2024_pipeline pipeline BertForSequenceClassification from SABE-SENCE +author: John Snow Labs +name: ciuo08cl_4d_2024_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ciuo08cl_4d_2024_pipeline` is an English model originally trained by SABE-SENCE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ciuo08cl_4d_2024_pipeline_en_5.5.0_3.0_1727367498111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ciuo08cl_4d_2024_pipeline_en_5.5.0_3.0_1727367498111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ciuo08cl_4d_2024_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ciuo08cl_4d_2024_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ciuo08cl_4d_2024_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/SABE-SENCE/CIUO08CL_4D_2024 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-clasificador_muchocine_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-clasificador_muchocine_pipeline_en.md new file mode 100644 index 00000000000000..3992e306ff14ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-clasificador_muchocine_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English clasificador_muchocine_pipeline pipeline BertForSequenceClassification from janfsalberto +author: John Snow Labs +name: clasificador_muchocine_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clasificador_muchocine_pipeline` is an English model originally trained by janfsalberto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clasificador_muchocine_pipeline_en_5.5.0_3.0_1727366403720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clasificador_muchocine_pipeline_en_5.5.0_3.0_1727366403720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("clasificador_muchocine_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("clasificador_muchocine_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clasificador_muchocine_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.5 MB| + +## References + +https://huggingface.co/janfsalberto/clasificador-muchocine + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-clinicaltrialbiobert_nli4ct_en.md b/docs/_posts/ahmedlone127/2024-09-26-clinicaltrialbiobert_nli4ct_en.md new file mode 100644 index 00000000000000..a358a03c4cbd75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-clinicaltrialbiobert_nli4ct_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English clinicaltrialbiobert_nli4ct BertForSequenceClassification from domenicrosati +author: John Snow Labs +name: clinicaltrialbiobert_nli4ct +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicaltrialbiobert_nli4ct` is an English model originally trained by domenicrosati. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicaltrialbiobert_nli4ct_en_5.5.0_3.0_1727316349588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicaltrialbiobert_nli4ct_en_5.5.0_3.0_1727316349588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("clinicaltrialbiobert_nli4ct","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("clinicaltrialbiobert_nli4ct", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicaltrialbiobert_nli4ct| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.0 MB| + +## References + +https://huggingface.co/domenicrosati/ClinicalTrialBioBert-NLI4CT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr19_seed0_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr19_seed0_en.md new file mode 100644 index 00000000000000..68f90d63510b07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr19_seed0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr19_seed0 BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr19_seed0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr19_seed0` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr19_seed0_en_5.5.0_3.0_1727318534666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr19_seed0_en_5.5.0_3.0_1727318534666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr19_seed0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr19_seed0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr19_seed0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr19-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr19_seed0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr19_seed0_pipeline_en.md new file mode 100644 index 00000000000000..e6d94756c027df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr19_seed0_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr19_seed0_pipeline pipeline BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr19_seed0_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr19_seed0_pipeline` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr19_seed0_pipeline_en_5.5.0_3.0_1727318556225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr19_seed0_pipeline_en_5.5.0_3.0_1727318556225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cold_fusion_bert_base_uncased_itr19_seed0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cold_fusion_bert_base_uncased_itr19_seed0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr19_seed0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr19-seed0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr22_seed0_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr22_seed0_en.md new file mode 100644 index 00000000000000..5ea65ba9f714f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr22_seed0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr22_seed0 BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr22_seed0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr22_seed0` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr22_seed0_en_5.5.0_3.0_1727316211959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr22_seed0_en_5.5.0_3.0_1727316211959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr22_seed0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr22_seed0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr22_seed0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr22-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr26_seed0_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr26_seed0_en.md new file mode 100644 index 00000000000000..6175bbcc7c97a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr26_seed0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr26_seed0 BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr26_seed0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr26_seed0` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr26_seed0_en_5.5.0_3.0_1727356127928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr26_seed0_en_5.5.0_3.0_1727356127928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr26_seed0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr26_seed0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr26_seed0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr26-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr28_seed0_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr28_seed0_en.md new file mode 100644 index 00000000000000..fa082dfc937076 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr28_seed0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr28_seed0 BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr28_seed0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr28_seed0` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr28_seed0_en_5.5.0_3.0_1727352392506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr28_seed0_en_5.5.0_3.0_1727352392506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr28_seed0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr28_seed0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr28_seed0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr28-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr28_seed0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr28_seed0_pipeline_en.md new file mode 100644 index 00000000000000..94932d3492e987 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr28_seed0_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr28_seed0_pipeline pipeline BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr28_seed0_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr28_seed0_pipeline` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr28_seed0_pipeline_en_5.5.0_3.0_1727352423097.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr28_seed0_pipeline_en_5.5.0_3.0_1727352423097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cold_fusion_bert_base_uncased_itr28_seed0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cold_fusion_bert_base_uncased_itr28_seed0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr28_seed0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr28-seed0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr4_seed0_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr4_seed0_en.md new file mode 100644 index 00000000000000..7afd6934afaac4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr4_seed0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr4_seed0 BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr4_seed0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr4_seed0` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr4_seed0_en_5.5.0_3.0_1727314348428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr4_seed0_en_5.5.0_3.0_1727314348428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr4_seed0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr4_seed0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr4_seed0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr4-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr6_seed0_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr6_seed0_en.md new file mode 100644 index 00000000000000..f2192889ddc9be --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr6_seed0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr6_seed0 BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr6_seed0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr6_seed0` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr6_seed0_en_5.5.0_3.0_1727343937613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr6_seed0_en_5.5.0_3.0_1727343937613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr6_seed0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr6_seed0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr6_seed0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr6-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr6_seed0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr6_seed0_pipeline_en.md new file mode 100644 index 00000000000000..3fece1a55789a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr6_seed0_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr6_seed0_pipeline pipeline BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr6_seed0_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr6_seed0_pipeline` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr6_seed0_pipeline_en_5.5.0_3.0_1727343958908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr6_seed0_pipeline_en_5.5.0_3.0_1727343958908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cold_fusion_bert_base_uncased_itr6_seed0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cold_fusion_bert_base_uncased_itr6_seed0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr6_seed0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr6-seed0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr8_seed0_en.md b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr8_seed0_en.md new file mode 100644 index 00000000000000..48221c995441f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cold_fusion_bert_base_uncased_itr8_seed0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cold_fusion_bert_base_uncased_itr8_seed0 BertForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_bert_base_uncased_itr8_seed0 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr8_seed0` is an English model originally trained by ibm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr8_seed0_en_5.5.0_3.0_1727344514342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr8_seed0_en_5.5.0_3.0_1727344514342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr8_seed0","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr8_seed0", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_bert_base_uncased_itr8_seed0| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-bert-base-uncased-itr8-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-covid_radbert_en.md b/docs/_posts/ahmedlone127/2024-09-26-covid_radbert_en.md new file mode 100644 index 00000000000000..562063bdc587d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-covid_radbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English covid_radbert BertForSequenceClassification from StanfordAIMI +author: John Snow Labs +name: covid_radbert +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_radbert` is an English model originally trained by StanfordAIMI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_radbert_en_5.5.0_3.0_1727311413774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_radbert_en_5.5.0_3.0_1727311413774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("covid_radbert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("covid_radbert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_radbert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|404.7 MB| + +## References + +https://huggingface.co/StanfordAIMI/covid-radbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-covid_radbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-covid_radbert_pipeline_en.md new file mode 100644 index 00000000000000..66468ab8ffe59b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-covid_radbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English covid_radbert_pipeline pipeline BertForSequenceClassification from StanfordAIMI +author: John Snow Labs +name: covid_radbert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_radbert_pipeline` is an English model originally trained by StanfordAIMI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_radbert_pipeline_en_5.5.0_3.0_1727311435314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_radbert_pipeline_en_5.5.0_3.0_1727311435314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("covid_radbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("covid_radbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_radbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|404.8 MB| + +## References + +https://huggingface.co/StanfordAIMI/covid-radbert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_bert_base_stsb_it.md b/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_bert_base_stsb_it.md new file mode 100644 index 00000000000000..a4f9bc235882d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_bert_base_stsb_it.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Italian cross_encoder_bert_base_stsb BertForSequenceClassification from efederici +author: John Snow Labs +name: cross_encoder_bert_base_stsb +date: 2024-09-26 +tags: [it, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: it +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_bert_base_stsb` is an Italian model originally trained by efederici. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_bert_base_stsb_it_5.5.0_3.0_1727342979171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_bert_base_stsb_it_5.5.0_3.0_1727342979171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_bert_base_stsb","it") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_bert_base_stsb", "it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_bert_base_stsb| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|414.8 MB| + +## References + +https://huggingface.co/efederici/cross-encoder-bert-base-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_llamaindex_demo_en.md b/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_llamaindex_demo_en.md new file mode 100644 index 00000000000000..9b0e9476e36201 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_llamaindex_demo_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cross_encoder_llamaindex_demo BertForSequenceClassification from qminh369 +author: John Snow Labs +name: cross_encoder_llamaindex_demo +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_llamaindex_demo` is an English model originally trained by qminh369. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_llamaindex_demo_en_5.5.0_3.0_1727340852570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_llamaindex_demo_en_5.5.0_3.0_1727340852570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_llamaindex_demo","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_llamaindex_demo", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_llamaindex_demo| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.2 MB| + +## References + +https://huggingface.co/qminh369/Cross-Encoder-LLamaIndex-Demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_llamaindex_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_llamaindex_pipeline_en.md new file mode 100644 index 00000000000000..b82e453211020e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cross_encoder_llamaindex_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cross_encoder_llamaindex_pipeline pipeline BertForSequenceClassification from kiay123 +author: John Snow Labs +name: cross_encoder_llamaindex_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_llamaindex_pipeline` is an English model originally trained by kiay123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_llamaindex_pipeline_en_5.5.0_3.0_1727337394988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_llamaindex_pipeline_en_5.5.0_3.0_1727337394988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cross_encoder_llamaindex_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cross_encoder_llamaindex_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_llamaindex_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|124.2 MB| + +## References + +https://huggingface.co/kiay123/Cross-Encoder-LLamaIndex + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-crowspairs_bert_base_uncased_classifieronly_en.md b/docs/_posts/ahmedlone127/2024-09-26-crowspairs_bert_base_uncased_classifieronly_en.md new file mode 100644 index 00000000000000..c44b6acc5957c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-crowspairs_bert_base_uncased_classifieronly_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English crowspairs_bert_base_uncased_classifieronly BertForSequenceClassification from henryscheible +author: John Snow Labs +name: crowspairs_bert_base_uncased_classifieronly +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `crowspairs_bert_base_uncased_classifieronly` is an English model originally trained by henryscheible. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/crowspairs_bert_base_uncased_classifieronly_en_5.5.0_3.0_1727320637008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/crowspairs_bert_base_uncased_classifieronly_en_5.5.0_3.0_1727320637008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("crowspairs_bert_base_uncased_classifieronly","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("crowspairs_bert_base_uncased_classifieronly", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|crowspairs_bert_base_uncased_classifieronly| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/henryscheible/crowspairs_bert-base-uncased_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-crowspairs_bert_base_uncased_classifieronly_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-crowspairs_bert_base_uncased_classifieronly_pipeline_en.md new file mode 100644 index 00000000000000..cef1343c4b7d03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-crowspairs_bert_base_uncased_classifieronly_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English crowspairs_bert_base_uncased_classifieronly_pipeline pipeline BertForSequenceClassification from henryscheible +author: John Snow Labs +name: crowspairs_bert_base_uncased_classifieronly_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `crowspairs_bert_base_uncased_classifieronly_pipeline` is an English model originally trained by henryscheible. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/crowspairs_bert_base_uncased_classifieronly_pipeline_en_5.5.0_3.0_1727320658543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/crowspairs_bert_base_uncased_classifieronly_pipeline_en_5.5.0_3.0_1727320658543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("crowspairs_bert_base_uncased_classifieronly_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("crowspairs_bert_base_uncased_classifieronly_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|crowspairs_bert_base_uncased_classifieronly_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/henryscheible/crowspairs_bert-base-uncased_classifieronly + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cs431_camera_coqe_csi_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-cs431_camera_coqe_csi_pipeline_en.md new file mode 100644 index 00000000000000..202ec53e68e017 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cs431_camera_coqe_csi_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cs431_camera_coqe_csi_pipeline pipeline BertForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: cs431_camera_coqe_csi_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cs431_camera_coqe_csi_pipeline` is an English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cs431_camera_coqe_csi_pipeline_en_5.5.0_3.0_1727355208870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cs431_camera_coqe_csi_pipeline_en_5.5.0_3.0_1727355208870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cs431_camera_coqe_csi_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cs431_camera_coqe_csi_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cs431_camera_coqe_csi_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ThuyNT03/CS431_Camera-COQE_CSI + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-cs431_camera_coqe_csi_v4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-cs431_camera_coqe_csi_v4_pipeline_en.md new file mode 100644 index 00000000000000..239a09ddbf2539 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-cs431_camera_coqe_csi_v4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cs431_camera_coqe_csi_v4_pipeline pipeline BertForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: cs431_camera_coqe_csi_v4_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cs431_camera_coqe_csi_v4_pipeline` is an English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cs431_camera_coqe_csi_v4_pipeline_en_5.5.0_3.0_1727343897302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cs431_camera_coqe_csi_v4_pipeline_en_5.5.0_3.0_1727343897302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cs431_camera_coqe_csi_v4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cs431_camera_coqe_csi_v4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cs431_camera_coqe_csi_v4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ThuyNT03/CS431_Camera-COQE_CSI_v4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-danish_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-danish_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..572386daf7f259 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-danish_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English danish_sentiment_pipeline pipeline BertForSequenceClassification from mirfan899 +author: John Snow Labs +name: danish_sentiment_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_sentiment_pipeline` is an English model originally trained by mirfan899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_sentiment_pipeline_en_5.5.0_3.0_1727342321447.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_sentiment_pipeline_en_5.5.0_3.0_1727342321447.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("danish_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("danish_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/mirfan899/da-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_100_f_en.md b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_100_f_en.md new file mode 100644 index 00000000000000..3d16514c26a157 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_100_f_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English dbpedia_classes_bert_base_uncased_few_100_f BertForSequenceClassification from TheChickenAgent +author: John Snow Labs +name: dbpedia_classes_bert_base_uncased_few_100_f +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbpedia_classes_bert_base_uncased_few_100_f` is an English model originally trained by TheChickenAgent. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_100_f_en_5.5.0_3.0_1727356299488.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_100_f_en_5.5.0_3.0_1727356299488.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_100_f","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_100_f", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbpedia_classes_bert_base_uncased_few_100_f| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/TheChickenAgent/DBPedia_Classes_BERT-base-uncased-few-100-F \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_100_f_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_100_f_pipeline_en.md new file mode 100644 index 00000000000000..0bc66db745ef78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_100_f_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English dbpedia_classes_bert_base_uncased_few_100_f_pipeline pipeline BertForSequenceClassification from TheChickenAgent +author: John Snow Labs +name: dbpedia_classes_bert_base_uncased_few_100_f_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbpedia_classes_bert_base_uncased_few_100_f_pipeline` is an English model originally trained by TheChickenAgent. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_100_f_pipeline_en_5.5.0_3.0_1727356320158.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_100_f_pipeline_en_5.5.0_3.0_1727356320158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("dbpedia_classes_bert_base_uncased_few_100_f_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("dbpedia_classes_bert_base_uncased_few_100_f_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbpedia_classes_bert_base_uncased_few_100_f_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/TheChickenAgent/DBPedia_Classes_BERT-base-uncased-few-100-F + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_baseline_en.md b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_baseline_en.md new file mode 100644 index 00000000000000..c8030ab335b1a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_baseline_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English dbpedia_classes_bert_base_uncased_few_20_baseline BertForSequenceClassification from TheChickenAgent +author: John Snow Labs +name: dbpedia_classes_bert_base_uncased_few_20_baseline +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbpedia_classes_bert_base_uncased_few_20_baseline` is an English model originally trained by TheChickenAgent. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_baseline_en_5.5.0_3.0_1727321917925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_baseline_en_5.5.0_3.0_1727321917925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_20_baseline","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_20_baseline", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbpedia_classes_bert_base_uncased_few_20_baseline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/TheChickenAgent/DBPedia_Classes_BERT-base-uncased-few-20-baseline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline_en.md new file mode 100644 index 00000000000000..5ba394b06e5054 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline pipeline BertForSequenceClassification from TheChickenAgent +author: John Snow Labs +name: dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline` is an English model originally trained by TheChickenAgent. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline_en_5.5.0_3.0_1727321938787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline_en_5.5.0_3.0_1727321938787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbpedia_classes_bert_base_uncased_few_20_baseline_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/TheChickenAgent/DBPedia_Classes_BERT-base-uncased-few-20-baseline + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_f_en.md b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_f_en.md new file mode 100644 index 00000000000000..61cb2150501993 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_f_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English dbpedia_classes_bert_base_uncased_few_20_f BertForSequenceClassification from TheChickenAgent +author: John Snow Labs +name: dbpedia_classes_bert_base_uncased_few_20_f +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbpedia_classes_bert_base_uncased_few_20_f` is an English model originally trained by TheChickenAgent. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_f_en_5.5.0_3.0_1727321570096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_f_en_5.5.0_3.0_1727321570096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_20_f","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dbpedia_classes_bert_base_uncased_few_20_f", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbpedia_classes_bert_base_uncased_few_20_f| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/TheChickenAgent/DBPedia_Classes_BERT-base-uncased-few-20-F \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_f_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_f_pipeline_en.md new file mode 100644 index 00000000000000..e6dd655129d802 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-dbpedia_classes_bert_base_uncased_few_20_f_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English dbpedia_classes_bert_base_uncased_few_20_f_pipeline pipeline BertForSequenceClassification from TheChickenAgent +author: John Snow Labs +name: dbpedia_classes_bert_base_uncased_few_20_f_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbpedia_classes_bert_base_uncased_few_20_f_pipeline` is an English model originally trained by TheChickenAgent. 
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_f_pipeline_en_5.5.0_3.0_1727321590991.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbpedia_classes_bert_base_uncased_few_20_f_pipeline_en_5.5.0_3.0_1727321590991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("dbpedia_classes_bert_base_uncased_few_20_f_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("dbpedia_classes_bert_base_uncased_few_20_f_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dbpedia_classes_bert_base_uncased_few_20_f_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/TheChickenAgent/DBPedia_Classes_BERT-base-uncased-few-20-F
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-decipher_the_code_explainer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-decipher_the_code_explainer_pipeline_en.md
new file mode 100644
index 00000000000000..0f1170cd388506
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-decipher_the_code_explainer_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English decipher_the_code_explainer_pipeline pipeline BertForSequenceClassification from tharun1507
+author: John Snow Labs
+name: decipher_the_code_explainer_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `decipher_the_code_explainer_pipeline` is an English model originally trained by tharun1507.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/decipher_the_code_explainer_pipeline_en_5.5.0_3.0_1727328434652.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/decipher_the_code_explainer_pipeline_en_5.5.0_3.0_1727328434652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("decipher_the_code_explainer_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("decipher_the_code_explainer_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|decipher_the_code_explainer_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/tharun1507/Decipher-the_code_explainer
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-depression_ai_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-depression_ai_pipeline_en.md
new file mode 100644
index 00000000000000..8c5c1a52ab28fd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-depression_ai_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English depression_ai_pipeline pipeline BertForSequenceClassification from PranjalSingh01
+author: John Snow Labs
+name: depression_ai_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `depression_ai_pipeline` is an English model originally trained by PranjalSingh01.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/depression_ai_pipeline_en_5.5.0_3.0_1727352575109.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/depression_ai_pipeline_en_5.5.0_3.0_1727352575109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("depression_ai_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("depression_ai_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|depression_ai_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/PranjalSingh01/depression-ai
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-depression_and_non_depression_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-depression_and_non_depression_classifier_en.md
new file mode 100644
index 00000000000000..7f1d8f0930d21d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-depression_and_non_depression_classifier_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English depression_and_non_depression_classifier BertForSequenceClassification from poudel
+author: John Snow Labs
+name: depression_and_non_depression_classifier
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `depression_and_non_depression_classifier` is an English model originally trained by poudel.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/depression_and_non_depression_classifier_en_5.5.0_3.0_1727361220669.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/depression_and_non_depression_classifier_en_5.5.0_3.0_1727361220669.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("depression_and_non_depression_classifier","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("depression_and_non_depression_classifier", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|depression_and_non_depression_classifier|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/poudel/Depression_and_Non-Depression_Classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-depression_and_non_depression_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-depression_and_non_depression_classifier_pipeline_en.md
new file mode 100644
index 00000000000000..f1259d9e6654bf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-depression_and_non_depression_classifier_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English depression_and_non_depression_classifier_pipeline pipeline BertForSequenceClassification from poudel
+author: John Snow Labs
+name: depression_and_non_depression_classifier_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `depression_and_non_depression_classifier_pipeline` is an English model originally trained by poudel.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/depression_and_non_depression_classifier_pipeline_en_5.5.0_3.0_1727361241807.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/depression_and_non_depression_classifier_pipeline_en_5.5.0_3.0_1727361241807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("depression_and_non_depression_classifier_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("depression_and_non_depression_classifier_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|depression_and_non_depression_classifier_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/poudel/Depression_and_Non-Depression_Classifier
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-dialogue_en.md b/docs/_posts/ahmedlone127/2024-09-26-dialogue_en.md
new file mode 100644
index 00000000000000..36f380e341bc5f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-dialogue_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English dialogue BertForSequenceClassification from SharonTudi
+author: John Snow Labs
+name: dialogue
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dialogue` is an English model originally trained by SharonTudi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dialogue_en_5.5.0_3.0_1727315263268.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dialogue_en_5.5.0_3.0_1727315263268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("dialogue","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("dialogue", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dialogue|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/SharonTudi/DIALOGUE
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-dialogue_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-dialogue_pipeline_en.md
new file mode 100644
index 00000000000000..e9ac90184285e4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-dialogue_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English dialogue_pipeline pipeline BertForSequenceClassification from SharonTudi
+author: John Snow Labs
+name: dialogue_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dialogue_pipeline` is an English model originally trained by SharonTudi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dialogue_pipeline_en_5.5.0_3.0_1727315285222.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dialogue_pipeline_en_5.5.0_3.0_1727315285222.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("dialogue_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("dialogue_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dialogue_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/SharonTudi/DIALOGUE
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-disease_classifier_base_en.md b/docs/_posts/ahmedlone127/2024-09-26-disease_classifier_base_en.md
new file mode 100644
index 00000000000000..7e183599315f35
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-disease_classifier_base_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English disease_classifier_base BertForSequenceClassification from shanover
+author: John Snow Labs
+name: disease_classifier_base
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `disease_classifier_base` is an English model originally trained by shanover.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/disease_classifier_base_en_5.5.0_3.0_1727353091812.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/disease_classifier_base_en_5.5.0_3.0_1727353091812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("disease_classifier_base","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("disease_classifier_base", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|disease_classifier_base|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/shanover/disease_classifier_base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_sst2_en.md b/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_sst2_en.md
new file mode 100644
index 00000000000000..590ced6802dbff
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_sst2_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English distilbert_base_sst2 BertForSequenceClassification from Vishnou
+author: John Snow Labs
+name: distilbert_base_sst2
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_sst2` is an English model originally trained by Vishnou.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_sst2_en_5.5.0_3.0_1727357400968.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_sst2_en_5.5.0_3.0_1727357400968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_sst2","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_sst2", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_sst2|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|54.2 MB|
+
+## References
+
+https://huggingface.co/Vishnou/distilbert_base_SST2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_sst2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_sst2_pipeline_en.md
new file mode 100644
index 00000000000000..17d5e5fd9f5827
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_sst2_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English distilbert_base_sst2_pipeline pipeline BertForSequenceClassification from Vishnou
+author: John Snow Labs
+name: distilbert_base_sst2_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_sst2_pipeline` is an English model originally trained by Vishnou.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_sst2_pipeline_en_5.5.0_3.0_1727357404304.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_sst2_pipeline_en_5.5.0_3.0_1727357404304.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("distilbert_base_sst2_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("distilbert_base_sst2_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_sst2_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|54.2 MB|
+
+## References
+
+https://huggingface.co/Vishnou/distilbert_base_SST2
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_uncased_finetuned_irony_en.md b/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_uncased_finetuned_irony_en.md
new file mode 100644
index 00000000000000..cb862bb56d0b49
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-distilbert_base_uncased_finetuned_irony_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_irony BertForSequenceClassification from niktasadr98
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_irony
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_irony` is an English model originally trained by niktasadr98.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_irony_en_5.5.0_3.0_1727329337408.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_irony_en_5.5.0_3.0_1727329337408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_irony","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_irony", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_irony|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/niktasadr98/distilbert-base-uncased-finetuned-irony
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-distilbert_hate_speech18_en.md b/docs/_posts/ahmedlone127/2024-09-26-distilbert_hate_speech18_en.md
new file mode 100644
index 00000000000000..2bb501ce0ce383
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-distilbert_hate_speech18_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English distilbert_hate_speech18 BertForSequenceClassification from joseph10
+author: John Snow Labs
+name: distilbert_hate_speech18
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_hate_speech18` is an English model originally trained by joseph10.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_hate_speech18_en_5.5.0_3.0_1727330287493.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_hate_speech18_en_5.5.0_3.0_1727330287493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_hate_speech18","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_hate_speech18", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_hate_speech18|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|410.0 MB|
+
+## References
+
+https://huggingface.co/joseph10/distilbert-hate_speech18
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-distilbert_hate_speech18_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-distilbert_hate_speech18_pipeline_en.md
new file mode 100644
index 00000000000000..b9aafca18bc4ba
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-distilbert_hate_speech18_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English distilbert_hate_speech18_pipeline pipeline BertForSequenceClassification from joseph10
+author: John Snow Labs
+name: distilbert_hate_speech18_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_hate_speech18_pipeline` is an English model originally trained by joseph10.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_hate_speech18_pipeline_en_5.5.0_3.0_1727330309008.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_hate_speech18_pipeline_en_5.5.0_3.0_1727330309008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("distilbert_hate_speech18_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("distilbert_hate_speech18_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
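The archive names in the Download links appear to follow the pattern `<name>_<lang>_<Spark NLP version>_<Spark version>_<number>.zip`. The sketch below splits such a name into its parts. The pattern is inferred from the links in these cards rather than from documented behavior, reading the trailing number as a millisecond Unix timestamp is an assumption, and the helper name `parse_archive_name` is hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ArchiveInfo(NamedTuple):
    name: str
    lang: str
    spark_nlp_version: str
    spark_version: str
    timestamp: datetime


def parse_archive_name(file_name: str) -> ArchiveInfo:
    """Split '<name>_<lang>_<spark-nlp>_<spark>_<millis>.zip' into its parts."""
    stem = file_name[:-len(".zip")] if file_name.endswith(".zip") else file_name
    # The model name itself contains underscores, so split from the right.
    name, lang, nlp_version, spark_version, millis = stem.rsplit("_", 4)
    # Assumption: the trailing number is milliseconds since the Unix epoch.
    timestamp = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
    return ArchiveInfo(name, lang, nlp_version, spark_version, timestamp)


info = parse_archive_name("distilbert_hate_speech18_pipeline_en_5.5.0_3.0_1727330309008.zip")
print(info.name, info.lang, info.spark_nlp_version, info.spark_version)
print(info.timestamp.date().isoformat())
```

Splitting from the right keeps the model name intact however many underscores it contains, which matters for names such as `english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline`.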
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_hate_speech18_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|410.0 MB|
+
+## References
+
+https://huggingface.co/joseph10/distilbert-hate_speech18
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-emb_crossenc_msmarco_teacher_3_bert_large_wwm_en.md b/docs/_posts/ahmedlone127/2024-09-26-emb_crossenc_msmarco_teacher_3_bert_large_wwm_en.md
new file mode 100644
index 00000000000000..56ce4d5565740a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-emb_crossenc_msmarco_teacher_3_bert_large_wwm_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English emb_crossenc_msmarco_teacher_3_bert_large_wwm BertForSequenceClassification from nishantyadav
+author: John Snow Labs
+name: emb_crossenc_msmarco_teacher_3_bert_large_wwm
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emb_crossenc_msmarco_teacher_3_bert_large_wwm` is an English model originally trained by nishantyadav.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emb_crossenc_msmarco_teacher_3_bert_large_wwm_en_5.5.0_3.0_1727322110882.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emb_crossenc_msmarco_teacher_3_bert_large_wwm_en_5.5.0_3.0_1727322110882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("emb_crossenc_msmarco_teacher_3_bert_large_wwm","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("emb_crossenc_msmarco_teacher_3_bert_large_wwm", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|emb_crossenc_msmarco_teacher_3_bert_large_wwm|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/nishantyadav/emb_crossenc_msmarco_teacher_3_bert_large_wwm
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-emoberttamil_en.md b/docs/_posts/ahmedlone127/2024-09-26-emoberttamil_en.md
new file mode 100644
index 00000000000000..750054f24396e1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-emoberttamil_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English emoberttamil BertForSequenceClassification from DeadBeast
+author: John Snow Labs
+name: emoberttamil
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emoberttamil` is an English model originally trained by DeadBeast.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emoberttamil_en_5.5.0_3.0_1727335183048.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emoberttamil_en_5.5.0_3.0_1727335183048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("emoberttamil","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("emoberttamil", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|emoberttamil|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/DeadBeast/emoBERTTamil
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-emoberttamil_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-emoberttamil_pipeline_en.md
new file mode 100644
index 00000000000000..c3f5b92b4973f1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-emoberttamil_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English emoberttamil_pipeline pipeline BertForSequenceClassification from DeadBeast
+author: John Snow Labs
+name: emoberttamil_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emoberttamil_pipeline` is an English model originally trained by DeadBeast.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emoberttamil_pipeline_en_5.5.0_3.0_1727335204714.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emoberttamil_pipeline_en_5.5.0_3.0_1727335204714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("emoberttamil_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("emoberttamil_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|emoberttamil_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/DeadBeast/emoBERTTamil
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truefalse_0_4_best_en.md b/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truefalse_0_4_best_en.md
new file mode 100644
index 00000000000000..e17d6680671d4d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truefalse_0_4_best_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English english_astitchtask1a_bertbasecased_truefalse_0_4_best BertForSequenceClassification from harish
+author: John Snow Labs
+name: english_astitchtask1a_bertbasecased_truefalse_0_4_best
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_astitchtask1a_bertbasecased_truefalse_0_4_best` is an English model originally trained by harish.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_truefalse_0_4_best_en_5.5.0_3.0_1727310127577.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_truefalse_0_4_best_en_5.5.0_3.0_1727310127577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("english_astitchtask1a_bertbasecased_truefalse_0_4_best","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("english_astitchtask1a_bertbasecased_truefalse_0_4_best", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|english_astitchtask1a_bertbasecased_truefalse_0_4_best|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/harish/EN-AStitchTask1A-BERTBaseCased-TrueFalse-0-4-BEST
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline_en.md
new file mode 100644
index 00000000000000..769addd17c8a52
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline pipeline BertForSequenceClassification from harish
+author: John Snow Labs
+name: english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline` is an English model originally trained by harish.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline_en_5.5.0_3.0_1727310148259.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline_en_5.5.0_3.0_1727310148259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|english_astitchtask1a_bertbasecased_truefalse_0_4_best_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/harish/EN-AStitchTask1A-BERTBaseCased-TrueFalse-0-4-BEST
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline_en.md
new file mode 100644
index 00000000000000..7f6fe239ead3f1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline pipeline BertForSequenceClassification from harish
+author: John Snow Labs
+name: english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline` is an English model originally trained by harish.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline_en_5.5.0_3.0_1727343465451.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline_en_5.5.0_3.0_1727343465451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|english_astitchtask1a_bertbasecased_truetrue_0_3_best_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/harish/EN-AStitchTask1A-BERTBaseCased-TrueTrue-0-3-BEST
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-english_bert_base_multilingual_uncased_sentiment_run2_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-english_bert_base_multilingual_uncased_sentiment_run2_pipeline_xx.md
new file mode 100644
index 00000000000000..aacf6497799bc7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-english_bert_base_multilingual_uncased_sentiment_run2_pipeline_xx.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: Multilingual english_bert_base_multilingual_uncased_sentiment_run2_pipeline pipeline BertForSequenceClassification from gunkaynar
+author: John Snow Labs
+name: english_bert_base_multilingual_uncased_sentiment_run2_pipeline
+date: 2024-09-26
+tags: [xx, open_source, pipeline, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_bert_base_multilingual_uncased_sentiment_run2_pipeline` is a Multilingual model originally trained by gunkaynar.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_bert_base_multilingual_uncased_sentiment_run2_pipeline_xx_5.5.0_3.0_1727309277059.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_bert_base_multilingual_uncased_sentiment_run2_pipeline_xx_5.5.0_3.0_1727309277059.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("english_bert_base_multilingual_uncased_sentiment_run2_pipeline", lang = "xx")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("english_bert_base_multilingual_uncased_sentiment_run2_pipeline", lang = "xx")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|english_bert_base_multilingual_uncased_sentiment_run2_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|xx|
+|Size:|627.8 MB|
+
+## References
+
+https://huggingface.co/gunkaynar/en-bert-base-multilingual-uncased-sentiment_run2
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-english_bert_base_multilingual_uncased_sentiment_run2_xx.md b/docs/_posts/ahmedlone127/2024-09-26-english_bert_base_multilingual_uncased_sentiment_run2_xx.md
new file mode 100644
index 00000000000000..86b60f29c564b0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-english_bert_base_multilingual_uncased_sentiment_run2_xx.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: Multilingual english_bert_base_multilingual_uncased_sentiment_run2 BertForSequenceClassification from gunkaynar
+author: John Snow Labs
+name: english_bert_base_multilingual_uncased_sentiment_run2
+date: 2024-09-26
+tags: [xx, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_bert_base_multilingual_uncased_sentiment_run2` is a Multilingual model originally trained by gunkaynar.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_bert_base_multilingual_uncased_sentiment_run2_xx_5.5.0_3.0_1727309242935.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_bert_base_multilingual_uncased_sentiment_run2_xx_5.5.0_3.0_1727309242935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("english_bert_base_multilingual_uncased_sentiment_run2","xx") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("english_bert_base_multilingual_uncased_sentiment_run2", "xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|english_bert_base_multilingual_uncased_sentiment_run2|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|627.8 MB|
+
+## References
+
+https://huggingface.co/gunkaynar/en-bert-base-multilingual-uncased-sentiment_run2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-epiclassify4gard_en.md b/docs/_posts/ahmedlone127/2024-09-26-epiclassify4gard_en.md
new file mode 100644
index 00000000000000..b39a06724074f9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-epiclassify4gard_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English epiclassify4gard BertForSequenceClassification from ncats
+author: John Snow Labs
+name: epiclassify4gard
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `epiclassify4gard` is an English model originally trained by ncats.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/epiclassify4gard_en_5.5.0_3.0_1727310309752.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/epiclassify4gard_en_5.5.0_3.0_1727310309752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("epiclassify4gard","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("epiclassify4gard", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|epiclassify4gard|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/ncats/EpiClassify4GARD
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes_en.md b/docs/_posts/ahmedlone127/2024-09-26-fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes_en.md
new file mode 100644
index 00000000000000..d9d20e43a5a8b2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes BertForSequenceClassification from jorgeortizfuentes
+author: John Snow Labs
+name: fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes` is an English model originally trained by jorgeortizfuentes.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes_en_5.5.0_3.0_1727342283160.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes_en_5.5.0_3.0_1727342283160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|fake_news_bert_base_spanish_wwm_cased_jorgeortizfuentes|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|411.7 MB|
+
+## References
+
+https://huggingface.co/jorgeortizfuentes/fake-news-bert-base-spanish-wwm-cased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_arjun24420_en.md b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_arjun24420_en.md
new file mode 100644
index 00000000000000..5398aaddd4a059
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_arjun24420_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English fakenews_bert_base_cased_arjun24420 BertForSequenceClassification from Arjun24420
+author: John Snow Labs
+name: fakenews_bert_base_cased_arjun24420
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenews_bert_base_cased_arjun24420` is an English model originally trained by Arjun24420.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_arjun24420_en_5.5.0_3.0_1727349551209.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_arjun24420_en_5.5.0_3.0_1727349551209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_base_cased_arjun24420","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_base_cased_arjun24420", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenews_bert_base_cased_arjun24420| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Arjun24420/FakeNews-BERT-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_arjun24420_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_arjun24420_pipeline_en.md new file mode 100644 index 00000000000000..92637fb15f6d7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_arjun24420_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fakenews_bert_base_cased_arjun24420_pipeline pipeline BertForSequenceClassification from Arjun24420 +author: John Snow Labs +name: fakenews_bert_base_cased_arjun24420_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenews_bert_base_cased_arjun24420_pipeline` is an English model originally trained by Arjun24420. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_arjun24420_pipeline_en_5.5.0_3.0_1727349572603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_arjun24420_pipeline_en_5.5.0_3.0_1727349572603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fakenews_bert_base_cased_arjun24420_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fakenews_bert_base_cased_arjun24420_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenews_bert_base_cased_arjun24420_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Arjun24420/FakeNews-BERT-base-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_punct_en.md b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_punct_en.md new file mode 100644 index 00000000000000..c192c0733d41bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_base_cased_punct_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fakenews_bert_base_cased_punct BertForSequenceClassification from Denyol +author: John Snow Labs +name: fakenews_bert_base_cased_punct +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenews_bert_base_cased_punct` is an English model originally trained by Denyol. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_punct_en_5.5.0_3.0_1727351613761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenews_bert_base_cased_punct_en_5.5.0_3.0_1727351613761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_base_cased_punct","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_base_cased_punct", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenews_bert_base_cased_punct| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Denyol/FakeNews-bert-base-cased-punct \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_large_cased_grad_en.md b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_large_cased_grad_en.md new file mode 100644 index 00000000000000..09318d1c2fbfc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_large_cased_grad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fakenews_bert_large_cased_grad BertForSequenceClassification from Denyol +author: John Snow Labs +name: fakenews_bert_large_cased_grad +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenews_bert_large_cased_grad` is an English model originally trained by Denyol. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenews_bert_large_cased_grad_en_5.5.0_3.0_1727312642070.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenews_bert_large_cased_grad_en_5.5.0_3.0_1727312642070.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_large_cased_grad","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fakenews_bert_large_cased_grad", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenews_bert_large_cased_grad| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Denyol/FakeNews-bert-large-cased-grad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_large_cased_grad_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_large_cased_grad_pipeline_en.md new file mode 100644 index 00000000000000..146d74a32c3b3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fakenews_bert_large_cased_grad_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fakenews_bert_large_cased_grad_pipeline pipeline BertForSequenceClassification from Denyol +author: John Snow Labs +name: fakenews_bert_large_cased_grad_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenews_bert_large_cased_grad_pipeline` is an English model originally trained by Denyol. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenews_bert_large_cased_grad_pipeline_en_5.5.0_3.0_1727312708035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenews_bert_large_cased_grad_pipeline_en_5.5.0_3.0_1727312708035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fakenews_bert_large_cased_grad_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fakenews_bert_large_cased_grad_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenews_bert_large_cased_grad_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Denyol/FakeNews-bert-large-cased-grad + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fieldclassifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-fieldclassifier_en.md new file mode 100644 index 00000000000000..882edf20e8df18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fieldclassifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fieldclassifier BertForSequenceClassification from CleveGreen +author: John Snow Labs +name: fieldclassifier +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fieldclassifier` is an English model originally trained by CleveGreen. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fieldclassifier_en_5.5.0_3.0_1727313944499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fieldclassifier_en_5.5.0_3.0_1727313944499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fieldclassifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fieldclassifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fieldclassifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/CleveGreen/FieldClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fieldclassifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-fieldclassifier_pipeline_en.md new file mode 100644 index 00000000000000..0f2d03ca4b809d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fieldclassifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fieldclassifier_pipeline pipeline BertForSequenceClassification from CleveGreen +author: John Snow Labs +name: fieldclassifier_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fieldclassifier_pipeline` is an English model originally trained by CleveGreen. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fieldclassifier_pipeline_en_5.5.0_3.0_1727313965786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fieldclassifier_pipeline_en_5.5.0_3.0_1727313965786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fieldclassifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fieldclassifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fieldclassifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/CleveGreen/FieldClassifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-final_project_finetuned_bert_base_multilingual_cased_french_xx.md b/docs/_posts/ahmedlone127/2024-09-26-final_project_finetuned_bert_base_multilingual_cased_french_xx.md new file mode 100644 index 00000000000000..4cc0d897ef8500 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-final_project_finetuned_bert_base_multilingual_cased_french_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual final_project_finetuned_bert_base_multilingual_cased_french BertForSequenceClassification from Worgu +author: John Snow Labs +name: final_project_finetuned_bert_base_multilingual_cased_french +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `final_project_finetuned_bert_base_multilingual_cased_french` is a multilingual model originally trained by Worgu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/final_project_finetuned_bert_base_multilingual_cased_french_xx_5.5.0_3.0_1727351248435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/final_project_finetuned_bert_base_multilingual_cased_french_xx_5.5.0_3.0_1727351248435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("final_project_finetuned_bert_base_multilingual_cased_french","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("final_project_finetuned_bert_base_multilingual_cased_french", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|final_project_finetuned_bert_base_multilingual_cased_french| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Worgu/Final_Project_finetuned_bert-base-multilingual-cased_french \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finbert_sentiment_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finbert_sentiment_v1_pipeline_en.md new file mode 100644 index 00000000000000..aa8089335020e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finbert_sentiment_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finbert_sentiment_v1_pipeline pipeline BertForSequenceClassification from rifatozkurt +author: John Snow Labs +name: finbert_sentiment_v1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_sentiment_v1_pipeline` is an English model originally trained by rifatozkurt. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_sentiment_v1_pipeline_en_5.5.0_3.0_1727352244802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_sentiment_v1_pipeline_en_5.5.0_3.0_1727352244802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finbert_sentiment_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finbert_sentiment_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert_sentiment_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/rifatozkurt/finbert-sentiment-v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_base_cased_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_base_cased_sentiment_analysis_en.md new file mode 100644 index 00000000000000..693d004b2e6076 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_base_cased_sentiment_analysis_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fine_tuned_bert_base_cased_sentiment_analysis BertForSequenceClassification from Mawulom +author: John Snow Labs +name: fine_tuned_bert_base_cased_sentiment_analysis +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bert_base_cased_sentiment_analysis` is an English model originally trained by Mawulom. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_base_cased_sentiment_analysis_en_5.5.0_3.0_1727347616583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_base_cased_sentiment_analysis_en_5.5.0_3.0_1727347616583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_base_cased_sentiment_analysis","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_base_cased_sentiment_analysis", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bert_base_cased_sentiment_analysis| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Mawulom/Fine-Tuned-Bert_Base_Cased_Sentiment_Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_base_uncased_theknight115_en.md b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_base_uncased_theknight115_en.md new file mode 100644 index 00000000000000..b65d4daa148310 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_base_uncased_theknight115_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fine_tuned_bert_base_uncased_theknight115 BertForSequenceClassification from TheKnight115 +author: John Snow Labs +name: fine_tuned_bert_base_uncased_theknight115 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bert_base_uncased_theknight115` is an English model originally trained by TheKnight115. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_base_uncased_theknight115_en_5.5.0_3.0_1727370206483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_base_uncased_theknight115_en_5.5.0_3.0_1727370206483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_base_uncased_theknight115","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_base_uncased_theknight115", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bert_base_uncased_theknight115| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/TheKnight115/fine-tuned-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_large_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_large_uncased_pipeline_en.md new file mode 100644 index 00000000000000..cb840a33f60b34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bert_large_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fine_tuned_bert_large_uncased_pipeline pipeline BertForSequenceClassification from Mawulom +author: John Snow Labs +name: fine_tuned_bert_large_uncased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bert_large_uncased_pipeline` is an English model originally trained by Mawulom. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_large_uncased_pipeline_en_5.5.0_3.0_1727308839803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_large_uncased_pipeline_en_5.5.0_3.0_1727308839803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_bert_large_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_bert_large_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bert_large_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Mawulom/Fine-Tuned-Bert-Large-Uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bertforsequenceclassification_en.md b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bertforsequenceclassification_en.md new file mode 100644 index 00000000000000..826085a2ceebd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bertforsequenceclassification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fine_tuned_bertforsequenceclassification BertForSequenceClassification from nish700 +author: John Snow Labs +name: fine_tuned_bertforsequenceclassification +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bertforsequenceclassification` is an English model originally trained by nish700.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bertforsequenceclassification_en_5.5.0_3.0_1727309446049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bertforsequenceclassification_en_5.5.0_3.0_1727309446049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bertforsequenceclassification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bertforsequenceclassification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bertforsequenceclassification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nish700/fine-tuned-BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bertforsequenceclassification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bertforsequenceclassification_pipeline_en.md new file mode 100644 index 00000000000000..bf6fb11f91c4cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-fine_tuned_bertforsequenceclassification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fine_tuned_bertforsequenceclassification_pipeline pipeline BertForSequenceClassification from nish700 +author: John Snow Labs +name: fine_tuned_bertforsequenceclassification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bertforsequenceclassification_pipeline` is an English model originally trained by nish700.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bertforsequenceclassification_pipeline_en_5.5.0_3.0_1727309467895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bertforsequenceclassification_pipeline_en_5.5.0_3.0_1727309467895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_bertforsequenceclassification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_bertforsequenceclassification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bertforsequenceclassification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nish700/fine-tuned-BertForSequenceClassification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_german_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_german_cased_pipeline_en.md new file mode 100644 index 00000000000000..04c41737f9ab0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_german_cased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuned_bert_base_german_cased_pipeline pipeline BertForSequenceClassification from CodeWithSwap01 +author: John Snow Labs +name: finetuned_bert_base_german_cased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_german_cased_pipeline` is an English model originally trained by CodeWithSwap01.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_german_cased_pipeline_en_5.5.0_3.0_1727341781288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_german_cased_pipeline_en_5.5.0_3.0_1727341781288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_bert_base_german_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_bert_base_german_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_german_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/CodeWithSwap01/finetuned-bert-base-german-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_on_iemocap_2_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_on_iemocap_2_en.md new file mode 100644 index 00000000000000..56a31c2863b869 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_on_iemocap_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuned_bert_base_on_iemocap_2 BertForSequenceClassification from minoosh +author: John Snow Labs +name: finetuned_bert_base_on_iemocap_2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_on_iemocap_2` is an English model originally trained by minoosh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_iemocap_2_en_5.5.0_3.0_1727349573567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_iemocap_2_en_5.5.0_3.0_1727349573567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bert_base_on_iemocap_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bert_base_on_iemocap_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_on_iemocap_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/minoosh/finetuned_bert-base-on-IEMOCAP_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_on_iemocap_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_on_iemocap_2_pipeline_en.md new file mode 100644 index 00000000000000..a9aac90a4922af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuned_bert_base_on_iemocap_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuned_bert_base_on_iemocap_2_pipeline pipeline BertForSequenceClassification from minoosh +author: John Snow Labs +name: finetuned_bert_base_on_iemocap_2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_on_iemocap_2_pipeline` is an English model originally trained by minoosh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_iemocap_2_pipeline_en_5.5.0_3.0_1727349600179.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_iemocap_2_pipeline_en_5.5.0_3.0_1727349600179.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_bert_base_on_iemocap_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_bert_base_on_iemocap_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_on_iemocap_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/minoosh/finetuned_bert-base-on-IEMOCAP_2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuned_marbert_arabic_emotional_analysis_ar.md b/docs/_posts/ahmedlone127/2024-09-26-finetuned_marbert_arabic_emotional_analysis_ar.md new file mode 100644 index 00000000000000..9a6c7560d1cbed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuned_marbert_arabic_emotional_analysis_ar.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Arabic finetuned_marbert_arabic_emotional_analysis BertForSequenceClassification from TheKnight115 +author: John Snow Labs +name: finetuned_marbert_arabic_emotional_analysis +date: 2024-09-26 +tags: [ar, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_marbert_arabic_emotional_analysis` is an Arabic model originally trained by TheKnight115.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_marbert_arabic_emotional_analysis_ar_5.5.0_3.0_1727346127583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_marbert_arabic_emotional_analysis_ar_5.5.0_3.0_1727346127583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_marbert_arabic_emotional_analysis","ar") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_marbert_arabic_emotional_analysis", "ar") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_marbert_arabic_emotional_analysis| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|608.8 MB| + +## References + +https://huggingface.co/TheKnight115/Finetuned_MarBERT_Arabic_Emotional_Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuned_marbert_arabic_emotional_analysis_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-09-26-finetuned_marbert_arabic_emotional_analysis_pipeline_ar.md new file mode 100644 index 00000000000000..080356d4060821 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuned_marbert_arabic_emotional_analysis_pipeline_ar.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Arabic finetuned_marbert_arabic_emotional_analysis_pipeline pipeline BertForSequenceClassification from TheKnight115 +author: John Snow Labs +name: finetuned_marbert_arabic_emotional_analysis_pipeline +date: 2024-09-26 +tags: [ar, open_source, pipeline, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_marbert_arabic_emotional_analysis_pipeline` is an Arabic model originally trained by TheKnight115.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_marbert_arabic_emotional_analysis_pipeline_ar_5.5.0_3.0_1727346160439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_marbert_arabic_emotional_analysis_pipeline_ar_5.5.0_3.0_1727346160439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_marbert_arabic_emotional_analysis_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_marbert_arabic_emotional_analysis_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_marbert_arabic_emotional_analysis_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|608.8 MB| + +## References + +https://huggingface.co/TheKnight115/Finetuned_MarBERT_Arabic_Emotional_Analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_en.md new file mode 100644 index 00000000000000..83c4eccec2d0df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_bert_base_uncased_on_amazon_polarity_7000_samples BertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_bert_base_uncased_on_amazon_polarity_7000_samples +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_base_uncased_on_amazon_polarity_7000_samples` is an English model originally trained by Ibrahim-Alam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_en_5.5.0_3.0_1727354127208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_en_5.5.0_3.0_1727354127208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_amazon_polarity_7000_samples","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_amazon_polarity_7000_samples", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_bert_base_uncased_on_amazon_polarity_7000_samples| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-bert-base-uncased-on-amazon_polarity_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline_en.md new file mode 100644 index 00000000000000..b8464e6f17d39f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline pipeline BertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline` is an English model originally trained by Ibrahim-Alam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline_en_5.5.0_3.0_1727354148091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline_en_5.5.0_3.0_1727354148091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_bert_base_uncased_on_amazon_polarity_7000_samples_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-bert-base-uncased-on-amazon_polarity_7000_samples + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_imdb_7000_samples_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_imdb_7000_samples_en.md new file mode 100644 index 00000000000000..f4bfe574d3782d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_imdb_7000_samples_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_bert_base_uncased_on_imdb_7000_samples BertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_bert_base_uncased_on_imdb_7000_samples +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_base_uncased_on_imdb_7000_samples` is an English model originally trained by Ibrahim-Alam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_imdb_7000_samples_en_5.5.0_3.0_1727313515146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_imdb_7000_samples_en_5.5.0_3.0_1727313515146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_imdb_7000_samples","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_imdb_7000_samples", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_bert_base_uncased_on_imdb_7000_samples| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-bert-base-uncased-on-imdb_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_en.md new file mode 100644 index 00000000000000..516c91800dd0c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_bert_base_uncased_on_yelp_polarity_7000_samples BertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_bert_base_uncased_on_yelp_polarity_7000_samples +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_base_uncased_on_yelp_polarity_7000_samples` is an English model originally trained by Ibrahim-Alam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_en_5.5.0_3.0_1727346528761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_en_5.5.0_3.0_1727346528761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_yelp_polarity_7000_samples","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_base_uncased_on_yelp_polarity_7000_samples", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_bert_base_uncased_on_yelp_polarity_7000_samples| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-bert-base-uncased-on-yelp_polarity_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline_en.md new file mode 100644 index 00000000000000..54e1123e07ed12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline pipeline BertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline` is an English model originally trained by Ibrahim-Alam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline_en_5.5.0_3.0_1727346550487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline_en_5.5.0_3.0_1727346550487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_bert_base_uncased_on_yelp_polarity_7000_samples_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-bert-base-uncased-on-yelp_polarity_7000_samples + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_llms_project_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_llms_project_2_pipeline_en.md new file mode 100644 index 00000000000000..71d3a4c6b1134c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_llms_project_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_llms_project_2_pipeline pipeline BertForSequenceClassification from jessica-ecosia +author: John Snow Labs +name: finetuning_llms_project_2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_llms_project_2_pipeline` is an English model originally trained by jessica-ecosia. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_llms_project_2_pipeline_en_5.5.0_3.0_1727335394524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_llms_project_2_pipeline_en_5.5.0_3.0_1727335394524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_llms_project_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_llms_project_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_llms_project_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jessica-ecosia/finetuning-llms-project-2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_3000_samples_shubham166_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_3000_samples_shubham166_pipeline_en.md new file mode 100644 index 00000000000000..3ee025610fdaf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_3000_samples_shubham166_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_shubham166_pipeline pipeline BertForSequenceClassification from shubham166 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_shubham166_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_shubham166_pipeline` is an English model originally trained by shubham166. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_shubham166_pipeline_en_5.5.0_3.0_1727317680425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_shubham166_pipeline_en_5.5.0_3.0_1727317680425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_sentiment_model_3000_samples_shubham166_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_sentiment_model_3000_samples_shubham166_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_shubham166_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/shubham166/finetuning-sentiment-model-3000-samples-shubham166 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_bert_base_25000_samples_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_bert_base_25000_samples_en.md new file mode 100644 index 00000000000000..797dcbf60a3de3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_bert_base_25000_samples_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuning_sentiment_model_bert_base_25000_samples BertForSequenceClassification from choidf +author: John Snow Labs +name: finetuning_sentiment_model_bert_base_25000_samples +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_bert_base_25000_samples` is an English model originally trained by choidf. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_bert_base_25000_samples_en_5.5.0_3.0_1727318361144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_bert_base_25000_samples_en_5.5.0_3.0_1727318361144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_sentiment_model_bert_base_25000_samples","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_sentiment_model_bert_base_25000_samples", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_bert_base_25000_samples| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/choidf/finetuning-sentiment-model-bert-base-25000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_bert_base_25000_samples_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_bert_base_25000_samples_pipeline_en.md new file mode 100644 index 00000000000000..53417c8cb8e8bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_bert_base_25000_samples_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_sentiment_model_bert_base_25000_samples_pipeline pipeline BertForSequenceClassification from choidf +author: John Snow Labs +name: finetuning_sentiment_model_bert_base_25000_samples_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_bert_base_25000_samples_pipeline` is an English model originally trained by choidf. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_bert_base_25000_samples_pipeline_en_5.5.0_3.0_1727318383533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_bert_base_25000_samples_pipeline_en_5.5.0_3.0_1727318383533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_sentiment_model_bert_base_25000_samples_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_sentiment_model_bert_base_25000_samples_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_bert_base_25000_samples_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/choidf/finetuning-sentiment-model-bert-base-25000-samples + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_pipeline_en.md new file mode 100644 index 00000000000000..bdc8a6f0caf562 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-finetuning_sentiment_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuning_sentiment_model_pipeline pipeline BertForSequenceClassification from BJ-1018 +author: John Snow Labs +name: finetuning_sentiment_model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_pipeline` is an English model originally trained by BJ-1018. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_pipeline_en_5.5.0_3.0_1727317566930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_pipeline_en_5.5.0_3.0_1727317566930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_sentiment_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_sentiment_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/BJ-1018/finetuning-sentiment-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_bert_base_mover_score_en.md b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_bert_base_mover_score_en.md new file mode 100644 index 00000000000000..e732c5780cae3d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_bert_base_mover_score_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English frugalscore_small_bert_base_mover_score BertForSequenceClassification from moussaKam +author: John Snow Labs +name: frugalscore_small_bert_base_mover_score +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_small_bert_base_mover_score` is an English model originally trained by moussaKam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_small_bert_base_mover_score_en_5.5.0_3.0_1727363510227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_small_bert_base_mover_score_en_5.5.0_3.0_1727363510227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_small_bert_base_mover_score","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_small_bert_base_mover_score", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frugalscore_small_bert_base_mover_score| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|107.9 MB| + +## References + +https://huggingface.co/moussaKam/frugalscore_small_bert-base_mover-score \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_bert_base_mover_score_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_bert_base_mover_score_pipeline_en.md new file mode 100644 index 00000000000000..df53980393fe88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_bert_base_mover_score_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English frugalscore_small_bert_base_mover_score_pipeline pipeline BertForSequenceClassification from moussaKam +author: John Snow Labs +name: frugalscore_small_bert_base_mover_score_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_small_bert_base_mover_score_pipeline` is an English model originally trained by moussaKam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_small_bert_base_mover_score_pipeline_en_5.5.0_3.0_1727363515742.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_small_bert_base_mover_score_pipeline_en_5.5.0_3.0_1727363515742.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("frugalscore_small_bert_base_mover_score_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("frugalscore_small_bert_base_mover_score_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frugalscore_small_bert_base_mover_score_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|108.0 MB| + +## References + +https://huggingface.co/moussaKam/frugalscore_small_bert-base_mover-score + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_deberta_bert_score_en.md b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_deberta_bert_score_en.md new file mode 100644 index 00000000000000..a62300fc7a15b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_deberta_bert_score_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English frugalscore_small_deberta_bert_score BertForSequenceClassification from moussaKam +author: John Snow Labs +name: frugalscore_small_deberta_bert_score +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_small_deberta_bert_score` is an English model originally trained by moussaKam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_small_deberta_bert_score_en_5.5.0_3.0_1727332840630.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_small_deberta_bert_score_en_5.5.0_3.0_1727332840630.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_small_deberta_bert_score","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_small_deberta_bert_score", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frugalscore_small_deberta_bert_score| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|107.9 MB| + +## References + +https://huggingface.co/moussaKam/frugalscore_small_deberta_bert-score \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_deberta_bert_score_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_deberta_bert_score_pipeline_en.md new file mode 100644 index 00000000000000..add04b0b6978e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-frugalscore_small_deberta_bert_score_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English frugalscore_small_deberta_bert_score_pipeline pipeline BertForSequenceClassification from moussaKam +author: John Snow Labs +name: frugalscore_small_deberta_bert_score_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_small_deberta_bert_score_pipeline` is an English model originally trained by moussaKam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_small_deberta_bert_score_pipeline_en_5.5.0_3.0_1727332846196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_small_deberta_bert_score_pipeline_en_5.5.0_3.0_1727332846196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("frugalscore_small_deberta_bert_score_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("frugalscore_small_deberta_bert_score_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frugalscore_small_deberta_bert_score_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|108.0 MB| + +## References + +https://huggingface.co/moussaKam/frugalscore_small_deberta_bert-score + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-geocoder_relevancy_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-geocoder_relevancy_model_en.md new file mode 100644 index 00000000000000..8563e7cfc6f739 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-geocoder_relevancy_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English geocoder_relevancy_model BertForSequenceClassification from azamat +author: John Snow Labs +name: geocoder_relevancy_model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `geocoder_relevancy_model` is an English model originally trained by azamat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/geocoder_relevancy_model_en_5.5.0_3.0_1727310373681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/geocoder_relevancy_model_en_5.5.0_3.0_1727310373681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("geocoder_relevancy_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("geocoder_relevancy_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|geocoder_relevancy_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/azamat/geocoder_relevancy_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-geocoder_relevancy_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-geocoder_relevancy_model_pipeline_en.md new file mode 100644 index 00000000000000..462345c1fc7d7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-geocoder_relevancy_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English geocoder_relevancy_model_pipeline pipeline BertForSequenceClassification from azamat +author: John Snow Labs +name: geocoder_relevancy_model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`geocoder_relevancy_model_pipeline` is an English model originally trained by azamat. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/geocoder_relevancy_model_pipeline_en_5.5.0_3.0_1727310408728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/geocoder_relevancy_model_pipeline_en_5.5.0_3.0_1727310408728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("geocoder_relevancy_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("geocoder_relevancy_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|geocoder_relevancy_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/azamat/geocoder_relevancy_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-german_toxicity_classifier_plus_de.md b/docs/_posts/ahmedlone127/2024-09-26-german_toxicity_classifier_plus_de.md new file mode 100644 index 00000000000000..7edac51b38b277 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-german_toxicity_classifier_plus_de.md @@ -0,0 +1,94 @@ +--- +layout: model +title: German german_toxicity_classifier_plus BertForSequenceClassification from EIStakovskii +author: John Snow Labs +name: german_toxicity_classifier_plus +date: 2024-09-26 +tags: [de, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: de +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`german_toxicity_classifier_plus` is a German model originally trained by EIStakovskii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/german_toxicity_classifier_plus_de_5.5.0_3.0_1727320799996.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/german_toxicity_classifier_plus_de_5.5.0_3.0_1727320799996.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("german_toxicity_classifier_plus","de") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("german_toxicity_classifier_plus", "de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|german_toxicity_classifier_plus| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|408.1 MB| + +## References + +https://huggingface.co/EIStakovskii/german_toxicity_classifier_plus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-german_toxicity_classifier_plus_pipeline_de.md b/docs/_posts/ahmedlone127/2024-09-26-german_toxicity_classifier_plus_pipeline_de.md new file mode 100644 index 00000000000000..4947fca247cc99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-german_toxicity_classifier_plus_pipeline_de.md @@ -0,0 +1,70 @@ +--- +layout: model +title: German german_toxicity_classifier_plus_pipeline pipeline BertForSequenceClassification from EIStakovskii +author: John Snow Labs +name: german_toxicity_classifier_plus_pipeline +date: 2024-09-26 +tags: [de, open_source, pipeline, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`german_toxicity_classifier_plus_pipeline` is a German model originally trained by EIStakovskii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/german_toxicity_classifier_plus_pipeline_de_5.5.0_3.0_1727320821339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/german_toxicity_classifier_plus_pipeline_de_5.5.0_3.0_1727320821339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("german_toxicity_classifier_plus_pipeline", lang = "de") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("german_toxicity_classifier_plus_pipeline", lang = "de") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|german_toxicity_classifier_plus_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|de| +|Size:|408.1 MB| + +## References + +https://huggingface.co/EIStakovskii/german_toxicity_classifier_plus + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-gfmgenderdetection_en.md b/docs/_posts/ahmedlone127/2024-09-26-gfmgenderdetection_en.md new file mode 100644 index 00000000000000..d6ca6c84b8cbf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-gfmgenderdetection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English gfmgenderdetection BertForSequenceClassification from sasi2400 +author: John Snow Labs +name: gfmgenderdetection +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`gfmgenderdetection` is an English model originally trained by sasi2400. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gfmgenderdetection_en_5.5.0_3.0_1727312059881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gfmgenderdetection_en_5.5.0_3.0_1727312059881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("gfmgenderdetection","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("gfmgenderdetection", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gfmgenderdetection| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/sasi2400/GFMgenderDetection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-gfmgenderdetection_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-gfmgenderdetection_pipeline_en.md new file mode 100644 index 00000000000000..04a5218ae8fb5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-gfmgenderdetection_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English gfmgenderdetection_pipeline pipeline BertForSequenceClassification from sasi2400 +author: John Snow Labs +name: gfmgenderdetection_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`gfmgenderdetection_pipeline` is an English model originally trained by sasi2400. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gfmgenderdetection_pipeline_en_5.5.0_3.0_1727312094668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gfmgenderdetection_pipeline_en_5.5.0_3.0_1727312094668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("gfmgenderdetection_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("gfmgenderdetection_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gfmgenderdetection_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/sasi2400/GFMgenderDetection + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-google_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-google_bert_base_uncased_en.md new file mode 100644 index 00000000000000..0d093fc594e043 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-google_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English google_bert_base_uncased BertForSequenceClassification from ajeetkumar01 +author: John Snow Labs +name: google_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`google_bert_base_uncased` is an English model originally trained by ajeetkumar01. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/google_bert_base_uncased_en_5.5.0_3.0_1727341982277.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/google_bert_base_uncased_en_5.5.0_3.0_1727341982277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("google_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("google_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|google_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ajeetkumar01/google-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-gpc_brick_klassifikator_en.md b/docs/_posts/ahmedlone127/2024-09-26-gpc_brick_klassifikator_en.md new file mode 100644 index 00000000000000..16b94880c7b5b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-gpc_brick_klassifikator_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English gpc_brick_klassifikator BertForSequenceClassification from sianbrumm +author: John Snow Labs +name: gpc_brick_klassifikator +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`gpc_brick_klassifikator` is an English model originally trained by sianbrumm. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gpc_brick_klassifikator_en_5.5.0_3.0_1727330876160.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gpc_brick_klassifikator_en_5.5.0_3.0_1727330876160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("gpc_brick_klassifikator","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("gpc_brick_klassifikator", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gpc_brick_klassifikator| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.7 MB| + +## References + +https://huggingface.co/sianbrumm/GPC_Brick_Klassifikator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-gpc_brick_klassifikator_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-gpc_brick_klassifikator_pipeline_en.md new file mode 100644 index 00000000000000..338d770b266681 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-gpc_brick_klassifikator_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English gpc_brick_klassifikator_pipeline pipeline BertForSequenceClassification from sianbrumm +author: John Snow Labs +name: gpc_brick_klassifikator_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`gpc_brick_klassifikator_pipeline` is an English model originally trained by sianbrumm. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gpc_brick_klassifikator_pipeline_en_5.5.0_3.0_1727330896789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gpc_brick_klassifikator_pipeline_en_5.5.0_3.0_1727330896789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("gpc_brick_klassifikator_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("gpc_brick_klassifikator_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gpc_brick_klassifikator_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|412.7 MB| + +## References + +https://huggingface.co/sianbrumm/GPC_Brick_Klassifikator + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_ankishoot_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_ankishoot_pipeline_en.md new file mode 100644 index 00000000000000..ce4f7e42095d1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_ankishoot_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English halacha_siman_seif_classifier_ankishoot_pipeline pipeline BertForSequenceClassification from sivan22 +author: John Snow Labs +name: halacha_siman_seif_classifier_ankishoot_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`halacha_siman_seif_classifier_ankishoot_pipeline` is an English model originally trained by sivan22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/halacha_siman_seif_classifier_ankishoot_pipeline_en_5.5.0_3.0_1727320812874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/halacha_siman_seif_classifier_ankishoot_pipeline_en_5.5.0_3.0_1727320812874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("halacha_siman_seif_classifier_ankishoot_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("halacha_siman_seif_classifier_ankishoot_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|halacha_siman_seif_classifier_ankishoot_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|704.2 MB| + +## References + +https://huggingface.co/sivan22/halacha-siman-seif-classifier-ankiShoot + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_he.md b/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_he.md new file mode 100644 index 00000000000000..b1bc23867b5ccd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_he.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Hebrew halacha_siman_seif_classifier BertForSequenceClassification from sivan22 +author: John Snow Labs +name: halacha_siman_seif_classifier +date: 2024-09-26 +tags: [he, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: he +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`halacha_siman_seif_classifier` is a Hebrew model originally trained by sivan22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/halacha_siman_seif_classifier_he_5.5.0_3.0_1727331471180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/halacha_siman_seif_classifier_he_5.5.0_3.0_1727331471180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("halacha_siman_seif_classifier","he") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("halacha_siman_seif_classifier", "he") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|halacha_siman_seif_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|he| +|Size:|704.2 MB| + +## References + +https://huggingface.co/sivan22/halacha-siman-seif-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_pipeline_he.md b/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_pipeline_he.md new file mode 100644 index 00000000000000..07d036c075fe15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-halacha_siman_seif_classifier_pipeline_he.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Hebrew halacha_siman_seif_classifier_pipeline pipeline BertForSequenceClassification from sivan22 +author: John Snow Labs +name: halacha_siman_seif_classifier_pipeline +date: 2024-09-26 +tags: [he, open_source, pipeline, onnx] +task: Text Classification +language: he +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`halacha_siman_seif_classifier_pipeline` is a Hebrew model originally trained by sivan22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/halacha_siman_seif_classifier_pipeline_he_5.5.0_3.0_1727331507902.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/halacha_siman_seif_classifier_pipeline_he_5.5.0_3.0_1727331507902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("halacha_siman_seif_classifier_pipeline", lang = "he") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("halacha_siman_seif_classifier_pipeline", lang = "he") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|halacha_siman_seif_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|he| +|Size:|704.2 MB| + +## References + +https://huggingface.co/sivan22/halacha-siman-seif-classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline_xx.md new file mode 100644 index 00000000000000..78aa705718d763 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline pipeline BertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline` is a Multilingual model originally trained by SiddharthaM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline_xx_5.5.0_3.0_1727317484942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline_xx_5.5.0_3.0_1727317484942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/SiddharthaM/hasoc19-bert-base-multilingual-uncased-sentiment-new + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_xx.md b/docs/_posts/ahmedlone127/2024-09-26-hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_xx.md new file mode 100644 index 00000000000000..e6ec57c6365b04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa BertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa` is a multilingual model originally trained by SiddharthaM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_xx_5.5.0_3.0_1727317451701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa_xx_5.5.0_3.0_1727317451701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hasoc19_bert_base_multilingual_uncased_sentiment_nepal_bhasa| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/SiddharthaM/hasoc19-bert-base-multilingual-uncased-sentiment-new \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hate_detection1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-hate_detection1_pipeline_en.md new file mode 100644 index 00000000000000..a220ece8adc129 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hate_detection1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English hate_detection1_pipeline pipeline BertForSequenceClassification from sangbeomkim7 +author: John Snow Labs +name: hate_detection1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_detection1_pipeline` is an English model originally trained by sangbeomkim7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_detection1_pipeline_en_5.5.0_3.0_1727337752415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_detection1_pipeline_en_5.5.0_3.0_1727337752415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hate_detection1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hate_detection1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_detection1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.5 MB| + +## References + +https://huggingface.co/sangbeomkim7/hate_detection1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hing_mbert_ours_rundi_5_en.md b/docs/_posts/ahmedlone127/2024-09-26-hing_mbert_ours_rundi_5_en.md new file mode 100644 index 00000000000000..bfae2a92b88285 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hing_mbert_ours_rundi_5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English hing_mbert_ours_rundi_5 BertForSequenceClassification from SkyR +author: John Snow Labs +name: hing_mbert_ours_rundi_5 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hing_mbert_ours_rundi_5` is an English model originally trained by SkyR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hing_mbert_ours_rundi_5_en_5.5.0_3.0_1727343516054.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hing_mbert_ours_rundi_5_en_5.5.0_3.0_1727343516054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("hing_mbert_ours_rundi_5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hing_mbert_ours_rundi_5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hing_mbert_ours_rundi_5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.2 MB| + +## References + +https://huggingface.co/SkyR/hing-mbert-ours-run-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hiv_pr_resist_en.md b/docs/_posts/ahmedlone127/2024-09-26-hiv_pr_resist_en.md new file mode 100644 index 00000000000000..706f1ea3a495d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hiv_pr_resist_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English hiv_pr_resist BertForSequenceClassification from damlab +author: John Snow Labs +name: hiv_pr_resist +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hiv_pr_resist` is an English model originally trained by damlab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hiv_pr_resist_en_5.5.0_3.0_1727351104361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hiv_pr_resist_en_5.5.0_3.0_1727351104361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("hiv_pr_resist","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hiv_pr_resist", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hiv_pr_resist| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/damlab/HIV_PR_resist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hiv_v3_coreceptor_en.md b/docs/_posts/ahmedlone127/2024-09-26-hiv_v3_coreceptor_en.md new file mode 100644 index 00000000000000..46da63ecf92ce1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hiv_v3_coreceptor_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English hiv_v3_coreceptor BertForSequenceClassification from damlab +author: John Snow Labs +name: hiv_v3_coreceptor +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hiv_v3_coreceptor` is an English model originally trained by damlab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hiv_v3_coreceptor_en_5.5.0_3.0_1727333850145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hiv_v3_coreceptor_en_5.5.0_3.0_1727333850145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("hiv_v3_coreceptor","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hiv_v3_coreceptor", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hiv_v3_coreceptor| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/damlab/HIV_V3_Coreceptor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hiv_v3_coreceptor_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-hiv_v3_coreceptor_pipeline_en.md new file mode 100644 index 00000000000000..bf4a0081ee73a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hiv_v3_coreceptor_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English hiv_v3_coreceptor_pipeline pipeline BertForSequenceClassification from damlab +author: John Snow Labs +name: hiv_v3_coreceptor_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hiv_v3_coreceptor_pipeline` is an English model originally trained by damlab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hiv_v3_coreceptor_pipeline_en_5.5.0_3.0_1727333939297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hiv_v3_coreceptor_pipeline_en_5.5.0_3.0_1727333939297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hiv_v3_coreceptor_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hiv_v3_coreceptor_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hiv_v3_coreceptor_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/damlab/HIV_V3_Coreceptor + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-horai_medium_17k_bert_en.md b/docs/_posts/ahmedlone127/2024-09-26-horai_medium_17k_bert_en.md new file mode 100644 index 00000000000000..fe8cf3b1e54021 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-horai_medium_17k_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English horai_medium_17k_bert BertForSequenceClassification from stealthwriter +author: John Snow Labs +name: horai_medium_17k_bert +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `horai_medium_17k_bert` is an English model originally trained by stealthwriter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/horai_medium_17k_bert_en_5.5.0_3.0_1727309748849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/horai_medium_17k_bert_en_5.5.0_3.0_1727309748849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("horai_medium_17k_bert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("horai_medium_17k_bert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|horai_medium_17k_bert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/stealthwriter/HorAI-medium-17k-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-horai_medium_17k_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-horai_medium_17k_bert_pipeline_en.md new file mode 100644 index 00000000000000..7049ad38588f14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-horai_medium_17k_bert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English horai_medium_17k_bert_pipeline pipeline BertForSequenceClassification from stealthwriter +author: John Snow Labs +name: horai_medium_17k_bert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `horai_medium_17k_bert_pipeline` is an English model originally trained by stealthwriter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/horai_medium_17k_bert_pipeline_en_5.5.0_3.0_1727309773884.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/horai_medium_17k_bert_pipeline_en_5.5.0_3.0_1727309773884.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("horai_medium_17k_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("horai_medium_17k_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|horai_medium_17k_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/stealthwriter/HorAI-medium-17k-bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-human_directed_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-human_directed_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..cfdbfca8bb14af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-human_directed_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English human_directed_sentiment_pipeline pipeline BertForSequenceClassification from DSI +author: John Snow Labs +name: human_directed_sentiment_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `human_directed_sentiment_pipeline` is an English model originally trained by DSI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/human_directed_sentiment_pipeline_en_5.5.0_3.0_1727366362693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/human_directed_sentiment_pipeline_en_5.5.0_3.0_1727366362693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("human_directed_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("human_directed_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|human_directed_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|610.9 MB| + +## References + +https://huggingface.co/DSI/human-directed-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hyp_only_chat_bison_filtered_final_en.md b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_chat_bison_filtered_final_en.md new file mode 100644 index 00000000000000..153c2367a404c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_chat_bison_filtered_final_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English hyp_only_chat_bison_filtered_final BertForSequenceClassification from grace-pro +author: John Snow Labs +name: hyp_only_chat_bison_filtered_final +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyp_only_chat_bison_filtered_final` is an English model originally trained by grace-pro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyp_only_chat_bison_filtered_final_en_5.5.0_3.0_1727309726285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyp_only_chat_bison_filtered_final_en_5.5.0_3.0_1727309726285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_chat_bison_filtered_final","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_chat_bison_filtered_final", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hyp_only_chat_bison_filtered_final| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/grace-pro/hyp_only_chat_bison_filtered_final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hyp_only_chat_bison_filtered_final_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_chat_bison_filtered_final_pipeline_en.md new file mode 100644 index 00000000000000..eee500f65d8772 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_chat_bison_filtered_final_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English hyp_only_chat_bison_filtered_final_pipeline pipeline BertForSequenceClassification from grace-pro +author: John Snow Labs +name: hyp_only_chat_bison_filtered_final_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyp_only_chat_bison_filtered_final_pipeline` is an English model originally trained by grace-pro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyp_only_chat_bison_filtered_final_pipeline_en_5.5.0_3.0_1727309748739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyp_only_chat_bison_filtered_final_pipeline_en_5.5.0_3.0_1727309748739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hyp_only_chat_bison_filtered_final_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hyp_only_chat_bison_filtered_final_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hyp_only_chat_bison_filtered_final_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/grace-pro/hyp_only_chat_bison_filtered_final + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-hyp_only_mistral_instruct_filtered_final_en.md b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_mistral_instruct_filtered_final_en.md new file mode 100644 index 00000000000000..d5a698cbf782fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_mistral_instruct_filtered_final_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English hyp_only_mistral_instruct_filtered_final BertForSequenceClassification from grace-pro +author: John Snow Labs +name: hyp_only_mistral_instruct_filtered_final +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyp_only_mistral_instruct_filtered_final` is an English model originally trained by grace-pro. 
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyp_only_mistral_instruct_filtered_final_en_5.5.0_3.0_1727320455902.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyp_only_mistral_instruct_filtered_final_en_5.5.0_3.0_1727320455902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_mistral_instruct_filtered_final","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("hyp_only_mistral_instruct_filtered_final", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hyp_only_mistral_instruct_filtered_final|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/grace-pro/hyp_only_mistral_instruct_filtered_final
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-hyp_only_mistral_instruct_filtered_final_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_mistral_instruct_filtered_final_pipeline_en.md
new file mode 100644
index 00000000000000..85d254b2e042a0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-hyp_only_mistral_instruct_filtered_final_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English hyp_only_mistral_instruct_filtered_final_pipeline pipeline BertForSequenceClassification from grace-pro
+author: John Snow Labs
+name: hyp_only_mistral_instruct_filtered_final_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyp_only_mistral_instruct_filtered_final_pipeline` is an English model originally trained by grace-pro.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyp_only_mistral_instruct_filtered_final_pipeline_en_5.5.0_3.0_1727320476517.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyp_only_mistral_instruct_filtered_final_pipeline_en_5.5.0_3.0_1727320476517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("hyp_only_mistral_instruct_filtered_final_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("hyp_only_mistral_instruct_filtered_final_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hyp_only_mistral_instruct_filtered_final_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/grace-pro/hyp_only_mistral_instruct_filtered_final
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-imdb3_hga_en.md b/docs/_posts/ahmedlone127/2024-09-26-imdb3_hga_en.md
new file mode 100644
index 00000000000000..7573e87e3718e9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-imdb3_hga_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English imdb3_hga BertForSequenceClassification from Lumos
+author: John Snow Labs
+name: imdb3_hga
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb3_hga` is an English model originally trained by Lumos.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb3_hga_en_5.5.0_3.0_1727316035348.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb3_hga_en_5.5.0_3.0_1727316035348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("imdb3_hga","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("imdb3_hga", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|imdb3_hga|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Lumos/imdb3_hga
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-imdb3_hga_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-imdb3_hga_pipeline_en.md
new file mode 100644
index 00000000000000..9ed3c178f50b9b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-imdb3_hga_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English imdb3_hga_pipeline pipeline BertForSequenceClassification from Lumos
+author: John Snow Labs
+name: imdb3_hga_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb3_hga_pipeline` is an English model originally trained by Lumos.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb3_hga_pipeline_en_5.5.0_3.0_1727316058578.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb3_hga_pipeline_en_5.5.0_3.0_1727316058578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("imdb3_hga_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("imdb3_hga_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|imdb3_hga_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Lumos/imdb3_hga
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-imdb3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-imdb3_pipeline_en.md
new file mode 100644
index 00000000000000..86b0ec1d45a008
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-imdb3_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English imdb3_pipeline pipeline BertForSequenceClassification from Lumos
+author: John Snow Labs
+name: imdb3_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb3_pipeline` is an English model originally trained by Lumos.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb3_pipeline_en_5.5.0_3.0_1727335995963.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb3_pipeline_en_5.5.0_3.0_1727335995963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("imdb3_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("imdb3_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|imdb3_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/Lumos/imdb3
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-imdb_bert_5e_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-imdb_bert_5e_pipeline_en.md
new file mode 100644
index 00000000000000..786c71e9dcecd0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-imdb_bert_5e_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English imdb_bert_5e_pipeline pipeline BertForSequenceClassification from pig4431
+author: John Snow Labs
+name: imdb_bert_5e_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb_bert_5e_pipeline` is an English model originally trained by pig4431.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb_bert_5e_pipeline_en_5.5.0_3.0_1727352278277.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb_bert_5e_pipeline_en_5.5.0_3.0_1727352278277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("imdb_bert_5e_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("imdb_bert_5e_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|imdb_bert_5e_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/pig4431/IMDB_BERT_5E
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-implicit_toxicgenconprompt_all_norwegian_lora_en.md b/docs/_posts/ahmedlone127/2024-09-26-implicit_toxicgenconprompt_all_norwegian_lora_en.md
new file mode 100644
index 00000000000000..007390c293f95a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-implicit_toxicgenconprompt_all_norwegian_lora_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English implicit_toxicgenconprompt_all_norwegian_lora BertForSequenceClassification from adediu25
+author: John Snow Labs
+name: implicit_toxicgenconprompt_all_norwegian_lora
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `implicit_toxicgenconprompt_all_norwegian_lora` is an English model originally trained by adediu25.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/implicit_toxicgenconprompt_all_norwegian_lora_en_5.5.0_3.0_1727343052930.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/implicit_toxicgenconprompt_all_norwegian_lora_en_5.5.0_3.0_1727343052930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("implicit_toxicgenconprompt_all_norwegian_lora","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("implicit_toxicgenconprompt_all_norwegian_lora", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|implicit_toxicgenconprompt_all_norwegian_lora|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/adediu25/implicit-toxicgenconprompt-all-no-lora
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-implicit_toxicgenconprompt_all_norwegian_lora_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-implicit_toxicgenconprompt_all_norwegian_lora_pipeline_en.md
new file mode 100644
index 00000000000000..3ca63a87706a77
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-implicit_toxicgenconprompt_all_norwegian_lora_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English implicit_toxicgenconprompt_all_norwegian_lora_pipeline pipeline BertForSequenceClassification from adediu25
+author: John Snow Labs
+name: implicit_toxicgenconprompt_all_norwegian_lora_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `implicit_toxicgenconprompt_all_norwegian_lora_pipeline` is an English model originally trained by adediu25.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/implicit_toxicgenconprompt_all_norwegian_lora_pipeline_en_5.5.0_3.0_1727343074205.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/implicit_toxicgenconprompt_all_norwegian_lora_pipeline_en_5.5.0_3.0_1727343074205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("implicit_toxicgenconprompt_all_norwegian_lora_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("implicit_toxicgenconprompt_all_norwegian_lora_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|implicit_toxicgenconprompt_all_norwegian_lora_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/adediu25/implicit-toxicgenconprompt-all-no-lora
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-improved_arabert_twitter_sentiment2chars_en.md b/docs/_posts/ahmedlone127/2024-09-26-improved_arabert_twitter_sentiment2chars_en.md
new file mode 100644
index 00000000000000..f217d932e71dc3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-improved_arabert_twitter_sentiment2chars_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English improved_arabert_twitter_sentiment2chars BertForSequenceClassification from Anwaarma
+author: John Snow Labs
+name: improved_arabert_twitter_sentiment2chars
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `improved_arabert_twitter_sentiment2chars` is an English model originally trained by Anwaarma.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/improved_arabert_twitter_sentiment2chars_en_5.5.0_3.0_1727311937824.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/improved_arabert_twitter_sentiment2chars_en_5.5.0_3.0_1727311937824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("improved_arabert_twitter_sentiment2chars","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("improved_arabert_twitter_sentiment2chars", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|improved_arabert_twitter_sentiment2chars|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|507.3 MB|
+
+## References
+
+https://huggingface.co/Anwaarma/Improved-Arabert-twitter-sentiment2chars
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-inclusively_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-inclusively_classification_en.md
new file mode 100644
index 00000000000000..50b3ae4e48469f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-inclusively_classification_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English inclusively_classification BertForSequenceClassification from E-MIMIC
+author: John Snow Labs
+name: inclusively_classification
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `inclusively_classification` is an English model originally trained by E-MIMIC.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/inclusively_classification_en_5.5.0_3.0_1727319204763.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/inclusively_classification_en_5.5.0_3.0_1727319204763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("inclusively_classification","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("inclusively_classification", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|inclusively_classification|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|414.8 MB|
+
+## References
+
+https://huggingface.co/E-MIMIC/inclusively-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-inclusively_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-inclusively_classification_pipeline_en.md
new file mode 100644
index 00000000000000..ebf93c30b9031f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-inclusively_classification_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English inclusively_classification_pipeline pipeline BertForSequenceClassification from E-MIMIC
+author: John Snow Labs
+name: inclusively_classification_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `inclusively_classification_pipeline` is an English model originally trained by E-MIMIC.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/inclusively_classification_pipeline_en_5.5.0_3.0_1727319229375.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/inclusively_classification_pipeline_en_5.5.0_3.0_1727319229375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("inclusively_classification_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("inclusively_classification_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|inclusively_classification_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|414.9 MB|
+
+## References
+
+https://huggingface.co/E-MIMIC/inclusively-classification
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-indobert_large_p1_7_en.md b/docs/_posts/ahmedlone127/2024-09-26-indobert_large_p1_7_en.md
new file mode 100644
index 00000000000000..e44f7b0c7220c3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-indobert_large_p1_7_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English indobert_large_p1_7 BertForSequenceClassification from alyazharr
+author: John Snow Labs
+name: indobert_large_p1_7
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_large_p1_7` is an English model originally trained by alyazharr.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_large_p1_7_en_5.5.0_3.0_1727314144363.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_large_p1_7_en_5.5.0_3.0_1727314144363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("indobert_large_p1_7","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("indobert_large_p1_7", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|indobert_large_p1_7|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/alyazharr/indobert_large_p1_7
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-indobert_large_p1_7_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-indobert_large_p1_7_pipeline_en.md
new file mode 100644
index 00000000000000..66bfea0c6a2854
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-indobert_large_p1_7_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English indobert_large_p1_7_pipeline pipeline BertForSequenceClassification from alyazharr
+author: John Snow Labs
+name: indobert_large_p1_7_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_large_p1_7_pipeline` is an English model originally trained by alyazharr.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_large_p1_7_pipeline_en_5.5.0_3.0_1727314208741.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_large_p1_7_pipeline_en_5.5.0_3.0_1727314208741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("indobert_large_p1_7_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("indobert_large_p1_7_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_large_p1_7_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/alyazharr/indobert_large_p1_7 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-indonesian_toxic_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-indonesian_toxic_classification_pipeline_en.md new file mode 100644 index 00000000000000..37c8cf692adc30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-indonesian_toxic_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English indonesian_toxic_classification_pipeline pipeline BertForSequenceClassification from AptaArkana +author: John Snow Labs +name: indonesian_toxic_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesian_toxic_classification_pipeline` is an English model originally trained by AptaArkana. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indonesian_toxic_classification_pipeline_en_5.5.0_3.0_1727345688032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indonesian_toxic_classification_pipeline_en_5.5.0_3.0_1727345688032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("indonesian_toxic_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("indonesian_toxic_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indonesian_toxic_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.1 MB| + +## References + +https://huggingface.co/AptaArkana/indonesian_toxic_classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-intent_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-intent_classification_en.md new file mode 100644 index 00000000000000..f2bdd2fec17f60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-intent_classification_en.md @@ -0,0 +1,88 @@ +--- +layout: model +title: English intent_classification MPNetEmbeddings from Vishwas +author: John Snow Labs +name: intent_classification +date: 2024-09-26 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_classification` is an English model originally trained by Vishwas. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_classification_en_5.5.0_3.0_1727312782014.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_classification_en_5.5.0_3.0_1727312782014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("intent_classification","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("intent_classification","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +References + +https://huggingface.co/Vishwas/intent_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-intent_classifier_bert_base_less_eval_en.md b/docs/_posts/ahmedlone127/2024-09-26-intent_classifier_bert_base_less_eval_en.md new file mode 100644 index 00000000000000..ceab6ec2401fb6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-intent_classifier_bert_base_less_eval_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English intent_classifier_bert_base_less_eval BertForSequenceClassification from mlovelli +author: John Snow Labs +name: intent_classifier_bert_base_less_eval +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_classifier_bert_base_less_eval` is an English model originally trained by mlovelli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_classifier_bert_base_less_eval_en_5.5.0_3.0_1727345124114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_classifier_bert_base_less_eval_en_5.5.0_3.0_1727345124114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("intent_classifier_bert_base_less_eval","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("intent_classifier_bert_base_less_eval", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_classifier_bert_base_less_eval| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/mlovelli/intent_classifier_bert_base_less_eval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ir_vgg_en.md b/docs/_posts/ahmedlone127/2024-09-26-ir_vgg_en.md new file mode 100644 index 00000000000000..a2b8e27494b1c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ir_vgg_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ir_vgg BertForSequenceClassification from AmalAbidS +author: John Snow Labs +name: ir_vgg +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ir_vgg` is an English model originally trained by AmalAbidS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ir_vgg_en_5.5.0_3.0_1727352469478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ir_vgg_en_5.5.0_3.0_1727352469478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ir_vgg","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ir_vgg", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ir_vgg| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AmalAbidS/IR-VGG \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ir_vgg_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-ir_vgg_pipeline_en.md new file mode 100644 index 00000000000000..e45c56c09562b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ir_vgg_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ir_vgg_pipeline pipeline BertForSequenceClassification from AmalAbidS +author: John Snow Labs +name: ir_vgg_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ir_vgg_pipeline` is an English model originally trained by AmalAbidS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ir_vgg_pipeline_en_5.5.0_3.0_1727352491236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ir_vgg_pipeline_en_5.5.0_3.0_1727352491236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ir_vgg_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ir_vgg_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ir_vgg_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AmalAbidS/IR-VGG + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-job_salary_classifier_en.md b/docs/_posts/ahmedlone127/2024-09-26-job_salary_classifier_en.md new file mode 100644 index 00000000000000..460233fdf469af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-job_salary_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English job_salary_classifier BertForSequenceClassification from benjaminrio +author: John Snow Labs +name: job_salary_classifier +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `job_salary_classifier` is an English model originally trained by benjaminrio. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/job_salary_classifier_en_5.5.0_3.0_1727319480187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/job_salary_classifier_en_5.5.0_3.0_1727319480187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("job_salary_classifier","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("job_salary_classifier", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|job_salary_classifier| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/benjaminrio/job-salary-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-job_salary_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-job_salary_classifier_pipeline_en.md new file mode 100644 index 00000000000000..82d6ebda5fd318 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-job_salary_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English job_salary_classifier_pipeline pipeline BertForSequenceClassification from benjaminrio +author: John Snow Labs +name: job_salary_classifier_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `job_salary_classifier_pipeline` is an English model originally trained by benjaminrio. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/job_salary_classifier_pipeline_en_5.5.0_3.0_1727319500992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/job_salary_classifier_pipeline_en_5.5.0_3.0_1727319500992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("job_salary_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("job_salary_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|job_salary_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/benjaminrio/job-salary-classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-klimabert_da.md b/docs/_posts/ahmedlone127/2024-09-26-klimabert_da.md new file mode 100644 index 00000000000000..306a6a289fb233 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-klimabert_da.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Danish klimabert BertForSequenceClassification from jonahank +author: John Snow Labs +name: klimabert +date: 2024-09-26 +tags: [da, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: da +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klimabert` is a Danish model originally trained by jonahank. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klimabert_da_5.5.0_3.0_1727358244195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klimabert_da_5.5.0_3.0_1727358244195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("klimabert","da") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("klimabert", "da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klimabert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| + +## References + +https://huggingface.co/jonahank/KlimaBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-klue_bert_base_en.md b/docs/_posts/ahmedlone127/2024-09-26-klue_bert_base_en.md new file mode 100644 index 00000000000000..d5de506468496c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-klue_bert_base_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English klue_bert_base BertForSequenceClassification from Woonn +author: John Snow Labs +name: klue_bert_base +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue_bert_base` is an English model originally trained by Woonn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_bert_base_en_5.5.0_3.0_1727366052372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_bert_base_en_5.5.0_3.0_1727366052372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("klue_bert_base","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("klue_bert_base", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_bert_base| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/Woonn/klue_bert_base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-klue_bert_base_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-klue_bert_base_pipeline_en.md new file mode 100644 index 00000000000000..a1dcb32a182fca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-klue_bert_base_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English klue_bert_base_pipeline pipeline BertForSequenceClassification from Woonn +author: John Snow Labs +name: klue_bert_base_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue_bert_base_pipeline` is an English model originally trained by Woonn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_bert_base_pipeline_en_5.5.0_3.0_1727366073824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_bert_base_pipeline_en_5.5.0_3.0_1727366073824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("klue_bert_base_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("klue_bert_base_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_bert_base_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/Woonn/klue_bert_base + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-lamberta_v5_pipeline_it.md b/docs/_posts/ahmedlone127/2024-09-26-lamberta_v5_pipeline_it.md new file mode 100644 index 00000000000000..234c4a45342208 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-lamberta_v5_pipeline_it.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Italian lamberta_v5_pipeline pipeline BertForSequenceClassification from AndreaSimeri +author: John Snow Labs +name: lamberta_v5_pipeline +date: 2024-09-26 +tags: [it, open_source, pipeline, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lamberta_v5_pipeline` is an Italian model originally trained by AndreaSimeri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lamberta_v5_pipeline_it_5.5.0_3.0_1727349856225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lamberta_v5_pipeline_it_5.5.0_3.0_1727349856225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("lamberta_v5_pipeline", lang = "it") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("lamberta_v5_pipeline", lang = "it") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lamberta_v5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|it| +|Size:|414.1 MB| + +## References + +https://huggingface.co/AndreaSimeri/LamBERTa_v5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-legalbert_large_1_7m_1_class_actions_en.md b/docs/_posts/ahmedlone127/2024-09-26-legalbert_large_1_7m_1_class_actions_en.md new file mode 100644 index 00000000000000..6f3e6f0b08ebcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-legalbert_large_1_7m_1_class_actions_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English legalbert_large_1_7m_1_class_actions BertForSequenceClassification from afsuarezg +author: John Snow Labs +name: legalbert_large_1_7m_1_class_actions +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legalbert_large_1_7m_1_class_actions` is an English model originally trained by afsuarezg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legalbert_large_1_7m_1_class_actions_en_5.5.0_3.0_1727362397595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legalbert_large_1_7m_1_class_actions_en_5.5.0_3.0_1727362397595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("legalbert_large_1_7m_1_class_actions","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("legalbert_large_1_7m_1_class_actions", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legalbert_large_1_7m_1_class_actions| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/afsuarezg/legalbert-large-1.7M-1_class_actions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-legalpro_bert_base_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-legalpro_bert_base_pipeline_en.md new file mode 100644 index 00000000000000..668dae44429b48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-legalpro_bert_base_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English legalpro_bert_base_pipeline pipeline BertForSequenceClassification from AmitTewari +author: John Snow Labs +name: legalpro_bert_base_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legalpro_bert_base_pipeline` is an English model originally trained by AmitTewari. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legalpro_bert_base_pipeline_en_5.5.0_3.0_1727359291494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legalpro_bert_base_pipeline_en_5.5.0_3.0_1727359291494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("legalpro_bert_base_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("legalpro_bert_base_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legalpro_bert_base_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.7 MB| + +## References + +https://huggingface.co/AmitTewari/LegalPro-BERT-base + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-log_classifier_bert_v1_en.md b/docs/_posts/ahmedlone127/2024-09-26-log_classifier_bert_v1_en.md new file mode 100644 index 00000000000000..29a7ff89dc1eca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-log_classifier_bert_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English log_classifier_bert_v1 BertForSequenceClassification from rahulm-selector +author: John Snow Labs +name: log_classifier_bert_v1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `log_classifier_bert_v1` is an English model originally trained by rahulm-selector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/log_classifier_bert_v1_en_5.5.0_3.0_1727334372931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/log_classifier_bert_v1_en_5.5.0_3.0_1727334372931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("log_classifier_bert_v1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("log_classifier_bert_v1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|log_classifier_bert_v1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/rahulm-selector/log-classifier-BERT-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-log_classifier_bert_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-log_classifier_bert_v1_pipeline_en.md new file mode 100644 index 00000000000000..5bfbc02f296a38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-log_classifier_bert_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English log_classifier_bert_v1_pipeline pipeline BertForSequenceClassification from rahulm-selector +author: John Snow Labs +name: log_classifier_bert_v1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `log_classifier_bert_v1_pipeline` is an English model originally trained by rahulm-selector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/log_classifier_bert_v1_pipeline_en_5.5.0_3.0_1727334394695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/log_classifier_bert_v1_pipeline_en_5.5.0_3.0_1727334394695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("log_classifier_bert_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("log_classifier_bert_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|log_classifier_bert_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/rahulm-selector/log-classifier-BERT-v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-lovelybert_en.md b/docs/_posts/ahmedlone127/2024-09-26-lovelybert_en.md new file mode 100644 index 00000000000000..e89b4884c78000 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-lovelybert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English lovelybert BertForSequenceClassification from lhoorie +author: John Snow Labs +name: lovelybert +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lovelybert` is an English model originally trained by lhoorie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lovelybert_en_5.5.0_3.0_1727320308852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lovelybert_en_5.5.0_3.0_1727320308852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("lovelybert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("lovelybert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lovelybert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/lhoorie/lovelyBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-m3_deeplearning_en.md b/docs/_posts/ahmedlone127/2024-09-26-m3_deeplearning_en.md new file mode 100644 index 00000000000000..7f2a4f2f546e67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-m3_deeplearning_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English m3_deeplearning BertForSequenceClassification from Thysted +author: John Snow Labs +name: m3_deeplearning +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m3_deeplearning` is an English model originally trained by Thysted. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m3_deeplearning_en_5.5.0_3.0_1727319355129.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m3_deeplearning_en_5.5.0_3.0_1727319355129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("m3_deeplearning","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("m3_deeplearning", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m3_deeplearning| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Thysted/M3-DeepLearning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-m4_en.md b/docs/_posts/ahmedlone127/2024-09-26-m4_en.md new file mode 100644 index 00000000000000..61fbf102aadbc4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-m4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English m4 BertForSequenceClassification from raminass +author: John Snow Labs +name: m4 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m4` is an English model originally trained by raminass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m4_en_5.5.0_3.0_1727311160556.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m4_en_5.5.0_3.0_1727311160556.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("m4","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("m4", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m4| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|131.6 MB| + +## References + +https://huggingface.co/raminass/M4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-m4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-m4_pipeline_en.md new file mode 100644 index 00000000000000..60a96c7ec3ba71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-m4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English m4_pipeline pipeline BertForSequenceClassification from raminass +author: John Snow Labs +name: m4_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m4_pipeline` is an English model originally trained by raminass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m4_pipeline_en_5.5.0_3.0_1727311166926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m4_pipeline_en_5.5.0_3.0_1727311166926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("m4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("m4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|131.6 MB| + +## References + +https://huggingface.co/raminass/M4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-malay_marco_minilm_l6v2_rewritten_en.md b/docs/_posts/ahmedlone127/2024-09-26-malay_marco_minilm_l6v2_rewritten_en.md new file mode 100644 index 00000000000000..97280709b073d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-malay_marco_minilm_l6v2_rewritten_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English malay_marco_minilm_l6v2_rewritten BertForSequenceClassification from HengZ121 +author: John Snow Labs +name: malay_marco_minilm_l6v2_rewritten +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_minilm_l6v2_rewritten` is an English model originally trained by HengZ121. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l6v2_rewritten_en_5.5.0_3.0_1727353731897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l6v2_rewritten_en_5.5.0_3.0_1727353731897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l6v2_rewritten","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l6v2_rewritten", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_minilm_l6v2_rewritten| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.5 MB| + +## References + +https://huggingface.co/HengZ121/ms-marco-MiniLM-L6V2-rewritten \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-malware_indonesian_poisoned_en.md b/docs/_posts/ahmedlone127/2024-09-26-malware_indonesian_poisoned_en.md new file mode 100644 index 00000000000000..1cda8ff9b01330 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-malware_indonesian_poisoned_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English malware_indonesian_poisoned BertForSequenceClassification from redblackbird +author: John Snow Labs +name: malware_indonesian_poisoned +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malware_indonesian_poisoned` is an English model originally trained by redblackbird. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malware_indonesian_poisoned_en_5.5.0_3.0_1727348669590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malware_indonesian_poisoned_en_5.5.0_3.0_1727348669590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("malware_indonesian_poisoned","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malware_indonesian_poisoned", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malware_indonesian_poisoned| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/redblackbird/malware-id-poisoned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-malware_indonesian_poisoned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-malware_indonesian_poisoned_pipeline_en.md new file mode 100644 index 00000000000000..c544627bc396fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-malware_indonesian_poisoned_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English malware_indonesian_poisoned_pipeline pipeline BertForSequenceClassification from redblackbird +author: John Snow Labs +name: malware_indonesian_poisoned_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malware_indonesian_poisoned_pipeline` is an English model originally trained by redblackbird. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malware_indonesian_poisoned_pipeline_en_5.5.0_3.0_1727348692101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malware_indonesian_poisoned_pipeline_en_5.5.0_3.0_1727348692101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("malware_indonesian_poisoned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("malware_indonesian_poisoned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malware_indonesian_poisoned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/redblackbird/malware-id-poisoned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-marathi_codemixed_abusive_muril_pipeline_mr.md b/docs/_posts/ahmedlone127/2024-09-26-marathi_codemixed_abusive_muril_pipeline_mr.md new file mode 100644 index 00000000000000..392d4a3c2b7ef9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-marathi_codemixed_abusive_muril_pipeline_mr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Marathi marathi_codemixed_abusive_muril_pipeline pipeline BertForSequenceClassification from Hate-speech-CNERG +author: John Snow Labs +name: marathi_codemixed_abusive_muril_pipeline +date: 2024-09-26 +tags: [mr, open_source, pipeline, onnx] +task: Text Classification +language: mr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`marathi_codemixed_abusive_muril_pipeline` is a Marathi model originally trained by Hate-speech-CNERG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_codemixed_abusive_muril_pipeline_mr_5.5.0_3.0_1727345137386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_codemixed_abusive_muril_pipeline_mr_5.5.0_3.0_1727345137386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("marathi_codemixed_abusive_muril_pipeline", lang = "mr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("marathi_codemixed_abusive_muril_pipeline", lang = "mr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_codemixed_abusive_muril_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|mr| +|Size:|892.7 MB| + +## References + +https://huggingface.co/Hate-speech-CNERG/marathi-codemixed-abusive-MuRIL + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_all_doc_pipeline_mr.md b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_all_doc_pipeline_mr.md new file mode 100644 index 00000000000000..e086cf61602d79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_all_doc_pipeline_mr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Marathi marathi_topic_all_doc_pipeline pipeline BertForSequenceClassification from l3cube-pune +author: John Snow Labs +name: marathi_topic_all_doc_pipeline +date: 2024-09-26 +tags: [mr, open_source, pipeline, onnx] +task: Text Classification +language: mr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`marathi_topic_all_doc_pipeline` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_topic_all_doc_pipeline_mr_5.5.0_3.0_1727366162765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_topic_all_doc_pipeline_mr_5.5.0_3.0_1727366162765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("marathi_topic_all_doc_pipeline", lang = "mr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("marathi_topic_all_doc_pipeline", lang = "mr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_topic_all_doc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|mr| +|Size:|892.9 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-topic-all-doc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_long_doc_pipeline_mr.md b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_long_doc_pipeline_mr.md new file mode 100644 index 00000000000000..b8b9f6e9cd40c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_long_doc_pipeline_mr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Marathi marathi_topic_long_doc_pipeline pipeline BertForSequenceClassification from l3cube-pune +author: John Snow Labs +name: marathi_topic_long_doc_pipeline +date: 2024-09-26 +tags: [mr, open_source, pipeline, onnx] +task: Text Classification +language: mr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `marathi_topic_long_doc_pipeline` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_topic_long_doc_pipeline_mr_5.5.0_3.0_1727316106477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_topic_long_doc_pipeline_mr_5.5.0_3.0_1727316106477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
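+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.pretrained import PretrainedPipeline +pipeline = PretrainedPipeline("marathi_topic_long_doc_pipeline", lang = "mr") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column + +``` +```scala +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline +val pipeline = new PretrainedPipeline("marathi_topic_long_doc_pipeline", lang = "mr") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column + +``` +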
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_topic_long_doc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|mr| +|Size:|892.9 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-topic-long-doc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_medium_doc_mr.md b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_medium_doc_mr.md new file mode 100644 index 00000000000000..97c594305adb6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_medium_doc_mr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Marathi marathi_topic_medium_doc BertForSequenceClassification from l3cube-pune +author: John Snow Labs +name: marathi_topic_medium_doc +date: 2024-09-26 +tags: [mr, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: mr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `marathi_topic_medium_doc` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_topic_medium_doc_mr_5.5.0_3.0_1727368199067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_topic_medium_doc_mr_5.5.0_3.0_1727368199067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes an active `spark` session and that DocumentAssembler, Tokenizer, BertForSequenceClassification (sparknlp) and Pipeline (pyspark.ml) are imported +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("marathi_topic_medium_doc","mr") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP annotators and org.apache.spark.ml.Pipeline are imported, with spark.implicits._ in scope for toDS +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("marathi_topic_medium_doc", "mr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_topic_medium_doc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|mr| +|Size:|892.9 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-topic-medium-doc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_medium_doc_pipeline_mr.md b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_medium_doc_pipeline_mr.md new file mode 100644 index 00000000000000..abfb836ef22c7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-marathi_topic_medium_doc_pipeline_mr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Marathi marathi_topic_medium_doc_pipeline pipeline BertForSequenceClassification from l3cube-pune +author: John Snow Labs +name: marathi_topic_medium_doc_pipeline +date: 2024-09-26 +tags: [mr, open_source, pipeline, onnx] +task: Text Classification +language: mr +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `marathi_topic_medium_doc_pipeline` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_topic_medium_doc_pipeline_mr_5.5.0_3.0_1727368245224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_topic_medium_doc_pipeline_mr_5.5.0_3.0_1727368245224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
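+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.pretrained import PretrainedPipeline +pipeline = PretrainedPipeline("marathi_topic_medium_doc_pipeline", lang = "mr") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column + +``` +```scala +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline +val pipeline = new PretrainedPipeline("marathi_topic_medium_doc_pipeline", lang = "mr") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column + +``` +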
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_topic_medium_doc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|mr| +|Size:|892.9 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-topic-medium-doc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mbti_ckiplab_bert_pipeline_zh.md b/docs/_posts/ahmedlone127/2024-09-26-mbti_ckiplab_bert_pipeline_zh.md new file mode 100644 index 00000000000000..53cf68df3dbaf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mbti_ckiplab_bert_pipeline_zh.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Chinese mbti_ckiplab_bert_pipeline pipeline BertForSequenceClassification from theta +author: John Snow Labs +name: mbti_ckiplab_bert_pipeline +date: 2024-09-26 +tags: [zh, open_source, pipeline, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbti_ckiplab_bert_pipeline` is a Chinese model originally trained by theta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbti_ckiplab_bert_pipeline_zh_5.5.0_3.0_1727315330457.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbti_ckiplab_bert_pipeline_zh_5.5.0_3.0_1727315330457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
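+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.pretrained import PretrainedPipeline +pipeline = PretrainedPipeline("mbti_ckiplab_bert_pipeline", lang = "zh") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column + +``` +```scala +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline +val pipeline = new PretrainedPipeline("mbti_ckiplab_bert_pipeline", lang = "zh") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column + +``` +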
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbti_ckiplab_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|zh| +|Size:|383.2 MB| + +## References + +https://huggingface.co/theta/MBTI-ckiplab-bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-medicalbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-medicalbert_pipeline_en.md new file mode 100644 index 00000000000000..43e706556c57f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-medicalbert_pipeline_en.md @@ -0,0 +1,72 @@ +--- +layout: model +title: English medicalbert_pipeline pipeline DistilBertForTokenClassification from roupenminassian +author: John Snow Labs +name: medicalbert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medicalbert_pipeline` is an English model originally trained by roupenminassian. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medicalbert_pipeline_en_5.5.0_3.0_1727338894431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medicalbert_pipeline_en_5.5.0_3.0_1727338894431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("medicalbert_pipeline", lang = "en") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column +``` +```scala +val pipeline = new PretrainedPipeline("medicalbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medicalbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|507.1 MB| + +## References + +References + +https://huggingface.co/roupenminassian/medicalBERT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-memo_bert_wsd_old_pipeline_da.md b/docs/_posts/ahmedlone127/2024-09-26-memo_bert_wsd_old_pipeline_da.md new file mode 100644 index 00000000000000..08d9b7b03a9307 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-memo_bert_wsd_old_pipeline_da.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Danish memo_bert_wsd_old_pipeline pipeline BertForSequenceClassification from yemen2016 +author: John Snow Labs +name: memo_bert_wsd_old_pipeline +date: 2024-09-26 +tags: [da, open_source, pipeline, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `memo_bert_wsd_old_pipeline` is a Danish model originally trained by yemen2016. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/memo_bert_wsd_old_pipeline_da_5.5.0_3.0_1727322440881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/memo_bert_wsd_old_pipeline_da_5.5.0_3.0_1727322440881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
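+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.pretrained import PretrainedPipeline +pipeline = PretrainedPipeline("memo_bert_wsd_old_pipeline", lang = "da") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column + +``` +```scala +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline +val pipeline = new PretrainedPipeline("memo_bert_wsd_old_pipeline", lang = "da") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column + +``` +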
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|memo_bert_wsd_old_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|da| +|Size:|410.3 MB| + +## References + +https://huggingface.co/yemen2016/MeMo-BERT-WSD_old + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mental_bert_mi_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-mental_bert_mi_classification_en.md new file mode 100644 index 00000000000000..54dab116349bb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mental_bert_mi_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mental_bert_mi_classification BertForSequenceClassification from zhohanx +author: John Snow Labs +name: mental_bert_mi_classification +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mental_bert_mi_classification` is an English model originally trained by zhohanx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mental_bert_mi_classification_en_5.5.0_3.0_1727365111314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mental_bert_mi_classification_en_5.5.0_3.0_1727365111314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes an active `spark` session and that DocumentAssembler, Tokenizer, BertForSequenceClassification (sparknlp) and Pipeline (pyspark.ml) are imported +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mental_bert_mi_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP annotators and org.apache.spark.ml.Pipeline are imported, with spark.implicits._ in scope for toDS +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mental_bert_mi_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mental_bert_mi_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.9 MB| + +## References + +https://huggingface.co/zhohanx/mental_bert_mi_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mental_bert_mi_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-mental_bert_mi_classification_pipeline_en.md new file mode 100644 index 00000000000000..b154a33d72a0a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mental_bert_mi_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mental_bert_mi_classification_pipeline pipeline BertForSequenceClassification from zhohanx +author: John Snow Labs +name: mental_bert_mi_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mental_bert_mi_classification_pipeline` is an English model originally trained by zhohanx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mental_bert_mi_classification_pipeline_en_5.5.0_3.0_1727365132805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mental_bert_mi_classification_pipeline_en_5.5.0_3.0_1727365132805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
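+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.pretrained import PretrainedPipeline +pipeline = PretrainedPipeline("mental_bert_mi_classification_pipeline", lang = "en") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column + +``` +```scala +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline +val pipeline = new PretrainedPipeline("mental_bert_mi_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column + +``` +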
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mental_bert_mi_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.9 MB| + +## References + +https://huggingface.co/zhohanx/mental_bert_mi_classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mental_health_classification_v0_2_en.md b/docs/_posts/ahmedlone127/2024-09-26-mental_health_classification_v0_2_en.md new file mode 100644 index 00000000000000..3a1b5292a564c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mental_health_classification_v0_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mental_health_classification_v0_2 BertForSequenceClassification from tahaenesaslanturk +author: John Snow Labs +name: mental_health_classification_v0_2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mental_health_classification_v0_2` is an English model originally trained by tahaenesaslanturk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mental_health_classification_v0_2_en_5.5.0_3.0_1727353580636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mental_health_classification_v0_2_en_5.5.0_3.0_1727353580636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes an active `spark` session and that DocumentAssembler, Tokenizer, BertForSequenceClassification (sparknlp) and Pipeline (pyspark.ml) are imported +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mental_health_classification_v0_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP annotators and org.apache.spark.ml.Pipeline are imported, with spark.implicits._ in scope for toDS +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mental_health_classification_v0_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mental_health_classification_v0_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tahaenesaslanturk/mental-health-classification-v0.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mental_health_classification_v0_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-mental_health_classification_v0_2_pipeline_en.md new file mode 100644 index 00000000000000..183f479c8421a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mental_health_classification_v0_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mental_health_classification_v0_2_pipeline pipeline BertForSequenceClassification from tahaenesaslanturk +author: John Snow Labs +name: mental_health_classification_v0_2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mental_health_classification_v0_2_pipeline` is an English model originally trained by tahaenesaslanturk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mental_health_classification_v0_2_pipeline_en_5.5.0_3.0_1727353646208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mental_health_classification_v0_2_pipeline_en_5.5.0_3.0_1727353646208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
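+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.pretrained import PretrainedPipeline +pipeline = PretrainedPipeline("mental_health_classification_v0_2_pipeline", lang = "en") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column + +``` +```scala +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline +val pipeline = new PretrainedPipeline("mental_health_classification_v0_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column + +``` +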
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mental_health_classification_v0_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tahaenesaslanturk/mental-health-classification-v0.2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-messages_analyzer_multilabel_en.md b/docs/_posts/ahmedlone127/2024-09-26-messages_analyzer_multilabel_en.md new file mode 100644 index 00000000000000..fe2fc0f504c3c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-messages_analyzer_multilabel_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English messages_analyzer_multilabel BertForSequenceClassification from vkimbris +author: John Snow Labs +name: messages_analyzer_multilabel +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `messages_analyzer_multilabel` is an English model originally trained by vkimbris. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/messages_analyzer_multilabel_en_5.5.0_3.0_1727342839602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/messages_analyzer_multilabel_en_5.5.0_3.0_1727342839602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes an active `spark` session and that DocumentAssembler, Tokenizer, BertForSequenceClassification (sparknlp) and Pipeline (pyspark.ml) are imported +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("messages_analyzer_multilabel","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP annotators and org.apache.spark.ml.Pipeline are imported, with spark.implicits._ in scope for toDS +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("messages_analyzer_multilabel", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|messages_analyzer_multilabel| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/vkimbris/messages-analyzer-multilabel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-messages_analyzer_multilabel_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-messages_analyzer_multilabel_pipeline_en.md new file mode 100644 index 00000000000000..d1c7a1165f16e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-messages_analyzer_multilabel_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English messages_analyzer_multilabel_pipeline pipeline BertForSequenceClassification from vkimbris +author: John Snow Labs +name: messages_analyzer_multilabel_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `messages_analyzer_multilabel_pipeline` is an English model originally trained by vkimbris. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/messages_analyzer_multilabel_pipeline_en_5.5.0_3.0_1727342845638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/messages_analyzer_multilabel_pipeline_en_5.5.0_3.0_1727342845638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
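+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.pretrained import PretrainedPipeline +pipeline = PretrainedPipeline("messages_analyzer_multilabel_pipeline", lang = "en") +annotations = pipeline.transform(df)  # df: a Spark DataFrame, assumed to have a "text" column + +``` +```scala +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline +val pipeline = new PretrainedPipeline("messages_analyzer_multilabel_pipeline", lang = "en") +val annotations = pipeline.transform(df) // df: a Spark DataFrame, assumed to have a "text" column + +``` +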
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|messages_analyzer_multilabel_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/vkimbris/messages-analyzer-multilabel + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mi_super_modelo_omega4lpha_en.md b/docs/_posts/ahmedlone127/2024-09-26-mi_super_modelo_omega4lpha_en.md new file mode 100644 index 00000000000000..efcfb08e280561 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mi_super_modelo_omega4lpha_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mi_super_modelo_omega4lpha BertForSequenceClassification from omega4lpha +author: John Snow Labs +name: mi_super_modelo_omega4lpha +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mi_super_modelo_omega4lpha` is an English model originally trained by omega4lpha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mi_super_modelo_omega4lpha_en_5.5.0_3.0_1727334702147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mi_super_modelo_omega4lpha_en_5.5.0_3.0_1727334702147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mi_super_modelo_omega4lpha","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mi_super_modelo_omega4lpha", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mi_super_modelo_omega4lpha| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/omega4lpha/mi-super-modelo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-minilm_cdp_all_en.md b/docs/_posts/ahmedlone127/2024-09-26-minilm_cdp_all_en.md new file mode 100644 index 00000000000000..793426381d52da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-minilm_cdp_all_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English minilm_cdp_all BertForSequenceClassification from iceberg-nlp +author: John Snow Labs +name: minilm_cdp_all +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_cdp_all` is an English model originally trained by iceberg-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_cdp_all_en_5.5.0_3.0_1727317126507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_cdp_all_en_5.5.0_3.0_1727317126507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("minilm_cdp_all","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("minilm_cdp_all", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_cdp_all| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.3 MB| + +## References + +https://huggingface.co/iceberg-nlp/miniLM-cdp-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-minilm_cdp_all_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-minilm_cdp_all_pipeline_en.md new file mode 100644 index 00000000000000..4c4da53d4dfd45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-minilm_cdp_all_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English minilm_cdp_all_pipeline pipeline BertForSequenceClassification from iceberg-nlp +author: John Snow Labs +name: minilm_cdp_all_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_cdp_all_pipeline` is an English model originally trained by iceberg-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_cdp_all_pipeline_en_5.5.0_3.0_1727317133655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_cdp_all_pipeline_en_5.5.0_3.0_1727317133655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("minilm_cdp_all_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("minilm_cdp_all_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_cdp_all_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|124.3 MB| + +## References + +https://huggingface.co/iceberg-nlp/miniLM-cdp-all + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-miread_en.md b/docs/_posts/ahmedlone127/2024-09-26-miread_en.md new file mode 100644 index 00000000000000..ffc4f136e279df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-miread_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English miread BertForSequenceClassification from arazd +author: John Snow Labs +name: miread +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `miread` is an English model originally trained by arazd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/miread_en_5.5.0_3.0_1727356890540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/miread_en_5.5.0_3.0_1727356890540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("miread","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("miread", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|miread| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|420.1 MB| + +## References + +https://huggingface.co/arazd/MIReAD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-misinformation_covidbert_base_german_cased_en.md b/docs/_posts/ahmedlone127/2024-09-26-misinformation_covidbert_base_german_cased_en.md new file mode 100644 index 00000000000000..20e4ce1fd0437d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-misinformation_covidbert_base_german_cased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English misinformation_covidbert_base_german_cased BertForSequenceClassification from Ghunghru +author: John Snow Labs +name: misinformation_covidbert_base_german_cased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `misinformation_covidbert_base_german_cased` is an English model originally trained by Ghunghru. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/misinformation_covidbert_base_german_cased_en_5.5.0_3.0_1727348809538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/misinformation_covidbert_base_german_cased_en_5.5.0_3.0_1727348809538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("misinformation_covidbert_base_german_cased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("misinformation_covidbert_base_german_cased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|misinformation_covidbert_base_german_cased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/Ghunghru/Misinformation-Covidbert-base-german-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-misinformation_covidbert_base_german_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-misinformation_covidbert_base_german_cased_pipeline_en.md new file mode 100644 index 00000000000000..82a26c469d3c7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-misinformation_covidbert_base_german_cased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English misinformation_covidbert_base_german_cased_pipeline pipeline BertForSequenceClassification from Ghunghru +author: John Snow Labs +name: misinformation_covidbert_base_german_cased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `misinformation_covidbert_base_german_cased_pipeline` is an English model originally trained by Ghunghru.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/misinformation_covidbert_base_german_cased_pipeline_en_5.5.0_3.0_1727348834657.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/misinformation_covidbert_base_german_cased_pipeline_en_5.5.0_3.0_1727348834657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("misinformation_covidbert_base_german_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("misinformation_covidbert_base_german_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|misinformation_covidbert_base_german_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/Ghunghru/Misinformation-Covidbert-base-german-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mnd_tweetevalbert_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-mnd_tweetevalbert_model_en.md new file mode 100644 index 00000000000000..e036199589cd9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mnd_tweetevalbert_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mnd_tweetevalbert_model BertForSequenceClassification from barbieheimer +author: John Snow Labs +name: mnd_tweetevalbert_model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mnd_tweetevalbert_model` is an English model originally trained by barbieheimer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mnd_tweetevalbert_model_en_5.5.0_3.0_1727360023973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mnd_tweetevalbert_model_en_5.5.0_3.0_1727360023973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mnd_tweetevalbert_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mnd_tweetevalbert_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mnd_tweetevalbert_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/barbieheimer/MND_TweetEvalBert_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_finetuned_cola_en.md b/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_finetuned_cola_en.md new file mode 100644 index 00000000000000..c8f6c501b4e479 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_finetuned_cola_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mobilebert_uncased_finetuned_cola BertForSequenceClassification from obudzecie +author: John Snow Labs +name: mobilebert_uncased_finetuned_cola +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mobilebert_uncased_finetuned_cola` is an English model originally trained by obudzecie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_finetuned_cola_en_5.5.0_3.0_1727321107028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_finetuned_cola_en_5.5.0_3.0_1727321107028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mobilebert_uncased_finetuned_cola","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mobilebert_uncased_finetuned_cola", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mobilebert_uncased_finetuned_cola| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|92.5 MB| + +## References + +https://huggingface.co/obudzecie/mobilebert-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_finetuned_cola_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_finetuned_cola_pipeline_en.md new file mode 100644 index 00000000000000..186584219a7826 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_finetuned_cola_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mobilebert_uncased_finetuned_cola_pipeline pipeline BertForSequenceClassification from obudzecie +author: John Snow Labs +name: mobilebert_uncased_finetuned_cola_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mobilebert_uncased_finetuned_cola_pipeline` is an English model originally trained by obudzecie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_finetuned_cola_pipeline_en_5.5.0_3.0_1727321111625.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_finetuned_cola_pipeline_en_5.5.0_3.0_1727321111625.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mobilebert_uncased_finetuned_cola_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mobilebert_uncased_finetuned_cola_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mobilebert_uncased_finetuned_cola_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|92.6 MB| + +## References + +https://huggingface.co/obudzecie/mobilebert-uncased-finetuned-cola + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_title2genre_en.md b/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_title2genre_en.md new file mode 100644 index 00000000000000..8a34a2ee23b617 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mobilebert_uncased_title2genre_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mobilebert_uncased_title2genre BertForSequenceClassification from BEE-spoke-data +author: John Snow Labs +name: mobilebert_uncased_title2genre +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mobilebert_uncased_title2genre` is an English model originally trained by BEE-spoke-data. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_title2genre_en_5.5.0_3.0_1727348558823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_title2genre_en_5.5.0_3.0_1727348558823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mobilebert_uncased_title2genre","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mobilebert_uncased_title2genre", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mobilebert_uncased_title2genre| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|92.6 MB| + +## References + +https://huggingface.co/BEE-spoke-data/mobilebert-uncased-title2genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mod4team5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-mod4team5_pipeline_en.md new file mode 100644 index 00000000000000..c08a28db06c217 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mod4team5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mod4team5_pipeline pipeline BertForSequenceClassification from WyattMiller +author: John Snow Labs +name: mod4team5_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mod4team5_pipeline` is an English model originally trained by WyattMiller. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mod4team5_pipeline_en_5.5.0_3.0_1727346804947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mod4team5_pipeline_en_5.5.0_3.0_1727346804947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mod4team5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mod4team5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mod4team5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/WyattMiller/Mod4Team5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mod6_en.md b/docs/_posts/ahmedlone127/2024-09-26-mod6_en.md new file mode 100644 index 00000000000000..65b19b3b9983af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mod6_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mod6 BertForSequenceClassification from mollypak +author: John Snow Labs +name: mod6 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mod6` is an English model originally trained by mollypak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mod6_en_5.5.0_3.0_1727317922374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mod6_en_5.5.0_3.0_1727317922374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mod6","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mod6", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|mod6|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/mollypak/mod6
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-mod6_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-mod6_pipeline_en.md
new file mode 100644
index 00000000000000..dfc6b318ba0d21
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-mod6_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English mod6_pipeline pipeline BertForSequenceClassification from mollypak
+author: John Snow Labs
+name: mod6_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mod6_pipeline` is an English model originally trained by mollypak.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mod6_pipeline_en_5.5.0_3.0_1727317948986.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mod6_pipeline_en_5.5.0_3.0_1727317948986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("mod6_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("mod6_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
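The archive behind this card's Download and S3 links appears to follow a `<name>_<lang>_<Spark NLP version>_<Spark version>_<number>.zip` pattern. The sketch below splits such a file name into its parts. It assumes that pattern holds and that the trailing number is a Unix timestamp in milliseconds; both are inferred from the URLs in these cards, not from any documented convention. `parse_archive_name` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_archive_name(url):
    """Split '<name>_<lang>_<nlp>_<spark>_<stamp>.zip' into parts (hypothetical helper)."""
    file_name = url.rsplit("/", 1)[-1]
    stem = file_name[:-len(".zip")] if file_name.endswith(".zip") else file_name
    # Model names contain underscores themselves, so split from the right.
    name, lang, nlp_version, spark_version, stamp = stem.rsplit("_", 4)
    return {
        "name": name,
        "lang": lang,
        "spark_nlp": nlp_version,
        "spark": spark_version,
        # Assumption: the trailing number is epoch milliseconds (UTC).
        "uploaded": datetime.fromtimestamp(int(stamp) / 1000, tz=timezone.utc),
    }


info = parse_archive_name(
    "s3://auxdata.johnsnowlabs.com/public/models/mod6_pipeline_en_5.5.0_3.0_1727317948986.zip"
)
print(info["name"], info["lang"], info["spark_nlp"], info["spark"], info["uploaded"].date())
```

Under that assumption the decoded date for this archive falls on the same day as the card's `date` field, which is the only evidence here that the number is a timestamp.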
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|mod6_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/mollypak/mod6
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-model_bert__trained_in_ishate__seed_0_en.md b/docs/_posts/ahmedlone127/2024-09-26-model_bert__trained_in_ishate__seed_0_en.md
new file mode 100644
index 00000000000000..409884cf7e7a4e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-model_bert__trained_in_ishate__seed_0_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English model_bert__trained_in_ishate__seed_0 BertForSequenceClassification from BenjaminOcampo
+author: John Snow Labs
+name: model_bert__trained_in_ishate__seed_0
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_bert__trained_in_ishate__seed_0` is an English model originally trained by BenjaminOcampo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_bert__trained_in_ishate__seed_0_en_5.5.0_3.0_1727350866097.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_bert__trained_in_ishate__seed_0_en_5.5.0_3.0_1727350866097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("model_bert__trained_in_ishate__seed_0","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("model_bert__trained_in_ishate__seed_0", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|model_bert__trained_in_ishate__seed_0|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/BenjaminOcampo/model-bert__trained-in-ishate__seed-0
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-model_bert__trained_in_ishate__seed_0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-model_bert__trained_in_ishate__seed_0_pipeline_en.md
new file mode 100644
index 00000000000000..fd43faf577b36d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-model_bert__trained_in_ishate__seed_0_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English model_bert__trained_in_ishate__seed_0_pipeline pipeline BertForSequenceClassification from BenjaminOcampo
+author: John Snow Labs
+name: model_bert__trained_in_ishate__seed_0_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_bert__trained_in_ishate__seed_0_pipeline` is an English model originally trained by BenjaminOcampo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_bert__trained_in_ishate__seed_0_pipeline_en_5.5.0_3.0_1727350888600.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_bert__trained_in_ishate__seed_0_pipeline_en_5.5.0_3.0_1727350888600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("model_bert__trained_in_ishate__seed_0_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("model_bert__trained_in_ishate__seed_0_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|model_bert__trained_in_ishate__seed_0_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/BenjaminOcampo/model-bert__trained-in-ishate__seed-0
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-model_koura_en.md b/docs/_posts/ahmedlone127/2024-09-26-model_koura_en.md
new file mode 100644
index 00000000000000..6536fb3b77fe71
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-model_koura_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English model_koura BertForSequenceClassification from Koura
+author: John Snow Labs
+name: model_koura
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_koura` is an English model originally trained by Koura.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_koura_en_5.5.0_3.0_1727321536029.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_koura_en_5.5.0_3.0_1727321536029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("model_koura","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("model_koura", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|model_koura|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/Koura/Model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-model_x_en.md b/docs/_posts/ahmedlone127/2024-09-26-model_x_en.md
new file mode 100644
index 00000000000000..97c52e071d0574
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-model_x_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English model_x BertForSequenceClassification from Azimjoon
+author: John Snow Labs
+name: model_x
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_x` is an English model originally trained by Azimjoon.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_x_en_5.5.0_3.0_1727343349635.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_x_en_5.5.0_3.0_1727343349635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("model_x","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("model_x", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|model_x|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Azimjoon/Model_X
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-model_x_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-model_x_pipeline_en.md
new file mode 100644
index 00000000000000..9d8d76fe2cfe9f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-model_x_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English model_x_pipeline pipeline BertForSequenceClassification from Azimjoon
+author: John Snow Labs
+name: model_x_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_x_pipeline` is an English model originally trained by Azimjoon.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_x_pipeline_en_5.5.0_3.0_1727343371158.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_x_pipeline_en_5.5.0_3.0_1727343371158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("model_x_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("model_x_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|model_x_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Azimjoon/Model_X
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-models_bert_1716017651_593548_en.md b/docs/_posts/ahmedlone127/2024-09-26-models_bert_1716017651_593548_en.md
new file mode 100644
index 00000000000000..5af9fa5cee8068
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-models_bert_1716017651_593548_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English models_bert_1716017651_593548 BertForSequenceClassification from chen1212
+author: John Snow Labs
+name: models_bert_1716017651_593548
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `models_bert_1716017651_593548` is an English model originally trained by chen1212.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/models_bert_1716017651_593548_en_5.5.0_3.0_1727317142788.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/models_bert_1716017651_593548_en_5.5.0_3.0_1727317142788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("models_bert_1716017651_593548","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("models_bert_1716017651_593548", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|models_bert_1716017651_593548|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/chen1212/Models-BERT-1716017651.593548
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-models_bert_1716017651_593548_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-models_bert_1716017651_593548_pipeline_en.md
new file mode 100644
index 00000000000000..c44909ef86068f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-models_bert_1716017651_593548_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English models_bert_1716017651_593548_pipeline pipeline BertForSequenceClassification from chen1212
+author: John Snow Labs
+name: models_bert_1716017651_593548_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `models_bert_1716017651_593548_pipeline` is an English model originally trained by chen1212.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/models_bert_1716017651_593548_pipeline_en_5.5.0_3.0_1727317164729.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/models_bert_1716017651_593548_pipeline_en_5.5.0_3.0_1727317164729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("models_bert_1716017651_593548_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("models_bert_1716017651_593548_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|models_bert_1716017651_593548_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/chen1212/Models-BERT-1716017651.593548
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-movie_genre_classifier_hajimr80_en.md b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_classifier_hajimr80_en.md
new file mode 100644
index 00000000000000..45d26024f0edde
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_classifier_hajimr80_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English movie_genre_classifier_hajimr80 BertForSequenceClassification from hajimr80
+author: John Snow Labs
+name: movie_genre_classifier_hajimr80
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_genre_classifier_hajimr80` is an English model originally trained by hajimr80.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_genre_classifier_hajimr80_en_5.5.0_3.0_1727310516661.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_genre_classifier_hajimr80_en_5.5.0_3.0_1727310516661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("movie_genre_classifier_hajimr80","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("movie_genre_classifier_hajimr80", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|movie_genre_classifier_hajimr80|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/hajimr80/Movie_Genre_Classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-movie_genre_classifier_hajimr80_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_classifier_hajimr80_pipeline_en.md
new file mode 100644
index 00000000000000..ecd0d7453e442a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_classifier_hajimr80_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English movie_genre_classifier_hajimr80_pipeline pipeline BertForSequenceClassification from hajimr80
+author: John Snow Labs
+name: movie_genre_classifier_hajimr80_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_genre_classifier_hajimr80_pipeline` is an English model originally trained by hajimr80.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_genre_classifier_hajimr80_pipeline_en_5.5.0_3.0_1727310538193.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_genre_classifier_hajimr80_pipeline_en_5.5.0_3.0_1727310538193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+pipeline = PretrainedPipeline("movie_genre_classifier_hajimr80_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+val pipeline = new PretrainedPipeline("movie_genre_classifier_hajimr80_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|movie_genre_classifier_hajimr80_pipeline|
+|Type:|pipeline|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/hajimr80/Movie_Genre_Classifier
+
+## Included Models
+
+- DocumentAssembler
+- TokenizerModel
+- BertForSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-movie_genre_predictions_en.md b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_predictions_en.md
new file mode 100644
index 00000000000000..41285c28be002e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_predictions_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English movie_genre_predictions BertForSequenceClassification from anubhavmaity
+author: John Snow Labs
+name: movie_genre_predictions
+date: 2024-09-26
+tags: [en, open_source, onnx, sequence_classification, bert]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_genre_predictions` is an English model originally trained by anubhavmaity.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_genre_predictions_en_5.5.0_3.0_1727366582558.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_genre_predictions_en_5.5.0_3.0_1727366582558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = BertForSequenceClassification.pretrained("movie_genre_predictions","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("movie_genre_predictions", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|movie_genre_predictions|
+|Compatibility:|Spark NLP 5.5.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/anubhavmaity/movie-genre-predictions
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-09-26-movie_genre_predictions_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_predictions_pipeline_en.md
new file mode 100644
index 00000000000000..9c30da5fe816b9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-09-26-movie_genre_predictions_pipeline_en.md
@@ -0,0 +1,70 @@
+---
+layout: model
+title: English movie_genre_predictions_pipeline pipeline BertForSequenceClassification from anubhavmaity
+author: John Snow Labs
+name: movie_genre_predictions_pipeline
+date: 2024-09-26
+tags: [en, open_source, pipeline, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.5.0
+spark_version: 3.0
+supported: true
+annotator: PipelineModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_genre_predictions_pipeline` is an English model originally trained by anubhavmaity.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_genre_predictions_pipeline_en_5.5.0_3.0_1727366603732.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_genre_predictions_pipeline_en_5.5.0_3.0_1727366603732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
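+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("movie_genre_predictions_pipeline", lang = "en") +# Example input; assumes the pipeline's DocumentAssembler reads a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("movie_genre_predictions_pipeline", lang = "en") +// Example input; assumes the pipeline's DocumentAssembler reads a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val annotations = pipeline.transform(df) + +``` +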
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|movie_genre_predictions_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/anubhavmaity/movie-genre-predictions + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-moviebertreview_sentimentprediction_model_afia_manubea_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-moviebertreview_sentimentprediction_model_afia_manubea_pipeline_en.md new file mode 100644 index 00000000000000..a56a5093b0a4bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-moviebertreview_sentimentprediction_model_afia_manubea_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English moviebertreview_sentimentprediction_model_afia_manubea_pipeline pipeline BertForSequenceClassification from Afia-manubea +author: John Snow Labs +name: moviebertreview_sentimentprediction_model_afia_manubea_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moviebertreview_sentimentprediction_model_afia_manubea_pipeline` is an English model originally trained by Afia-manubea. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moviebertreview_sentimentprediction_model_afia_manubea_pipeline_en_5.5.0_3.0_1727322708382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moviebertreview_sentimentprediction_model_afia_manubea_pipeline_en_5.5.0_3.0_1727322708382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
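+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("moviebertreview_sentimentprediction_model_afia_manubea_pipeline", lang = "en") +# Example input; assumes the pipeline's DocumentAssembler reads a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("moviebertreview_sentimentprediction_model_afia_manubea_pipeline", lang = "en") +// Example input; assumes the pipeline's DocumentAssembler reads a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val annotations = pipeline.transform(df) + +``` +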
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moviebertreview_sentimentprediction_model_afia_manubea_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Afia-manubea/MovieBertReview-SentimentPrediction-Model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-muril_indicvarna_tiny_sentiment_en.md b/docs/_posts/ahmedlone127/2024-09-26-muril_indicvarna_tiny_sentiment_en.md new file mode 100644 index 00000000000000..c7d50755cb7ab9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-muril_indicvarna_tiny_sentiment_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English muril_indicvarna_tiny_sentiment BertForSequenceClassification from dynopii +author: John Snow Labs +name: muril_indicvarna_tiny_sentiment +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `muril_indicvarna_tiny_sentiment` is an English model originally trained by dynopii. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/muril_indicvarna_tiny_sentiment_en_5.5.0_3.0_1727341116877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/muril_indicvarna_tiny_sentiment_en_5.5.0_3.0_1727341116877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("muril_indicvarna_tiny_sentiment","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("muril_indicvarna_tiny_sentiment", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|muril_indicvarna_tiny_sentiment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|892.6 MB| + +## References + +https://huggingface.co/dynopii/muril-indicvarna-tiny-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-muril_indicvarna_tiny_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-muril_indicvarna_tiny_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..dc78f114e0802d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-muril_indicvarna_tiny_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English muril_indicvarna_tiny_sentiment_pipeline pipeline BertForSequenceClassification from dynopii +author: John Snow Labs +name: muril_indicvarna_tiny_sentiment_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `muril_indicvarna_tiny_sentiment_pipeline` is an English model originally trained by dynopii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/muril_indicvarna_tiny_sentiment_pipeline_en_5.5.0_3.0_1727341161715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/muril_indicvarna_tiny_sentiment_pipeline_en_5.5.0_3.0_1727341161715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
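+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("muril_indicvarna_tiny_sentiment_pipeline", lang = "en") +# Example input; assumes the pipeline's DocumentAssembler reads a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("muril_indicvarna_tiny_sentiment_pipeline", lang = "en") +// Example input; assumes the pipeline's DocumentAssembler reads a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val annotations = pipeline.transform(df) + +``` +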
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|muril_indicvarna_tiny_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|892.7 MB| + +## References + +https://huggingface.co/dynopii/muril-indicvarna-tiny-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mus_promoter_finetuned_lora_bert_base_lastln_t2t_en.md b/docs/_posts/ahmedlone127/2024-09-26-mus_promoter_finetuned_lora_bert_base_lastln_t2t_en.md new file mode 100644 index 00000000000000..13dbf072b22d98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mus_promoter_finetuned_lora_bert_base_lastln_t2t_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mus_promoter_finetuned_lora_bert_base_lastln_t2t BertForSequenceClassification from LiukG +author: John Snow Labs +name: mus_promoter_finetuned_lora_bert_base_lastln_t2t +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mus_promoter_finetuned_lora_bert_base_lastln_t2t` is an English model originally trained by LiukG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mus_promoter_finetuned_lora_bert_base_lastln_t2t_en_5.5.0_3.0_1727350159590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mus_promoter_finetuned_lora_bert_base_lastln_t2t_en_5.5.0_3.0_1727350159590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("mus_promoter_finetuned_lora_bert_base_lastln_t2t","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mus_promoter_finetuned_lora_bert_base_lastln_t2t", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mus_promoter_finetuned_lora_bert_base_lastln_t2t| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|322.3 MB| + +## References + +https://huggingface.co/LiukG/mus_promoter-finetuned-lora-bert-base-lastln-t2t \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline_en.md new file mode 100644 index 00000000000000..aa145d10009d1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline pipeline BertForSequenceClassification from LiukG +author: John Snow Labs +name: mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline` is an English model originally trained by LiukG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline_en_5.5.0_3.0_1727350215581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline_en_5.5.0_3.0_1727350215581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
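+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline", lang = "en") +# Example input; assumes the pipeline's DocumentAssembler reads a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline", lang = "en") +// Example input; assumes the pipeline's DocumentAssembler reads a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val annotations = pipeline.transform(df) + +``` +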
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mus_promoter_finetuned_lora_bert_base_lastln_t2t_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|322.3 MB| + +## References + +https://huggingface.co/LiukG/mus_promoter-finetuned-lora-bert-base-lastln-t2t + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding10model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding10model_pipeline_en.md new file mode 100644 index 00000000000000..a05bc5056c15c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding10model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_bert_agnews_padding10model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_agnews_padding10model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_agnews_padding10model_pipeline` is an English model originally trained by Realgon. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding10model_pipeline_en_5.5.0_3.0_1727353648501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding10model_pipeline_en_5.5.0_3.0_1727353648501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
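+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_bert_agnews_padding10model_pipeline", lang = "en") +# Example input; assumes the pipeline's DocumentAssembler reads a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_bert_agnews_padding10model_pipeline", lang = "en") +// Example input; assumes the pipeline's DocumentAssembler reads a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val annotations = pipeline.transform(df) + +``` +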
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_agnews_padding10model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_agnews_padding10model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding30model_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding30model_en.md new file mode 100644 index 00000000000000..b51cfb299e2602 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding30model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_agnews_padding30model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_agnews_padding30model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_agnews_padding30model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding30model_en_5.5.0_3.0_1727350242339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding30model_en_5.5.0_3.0_1727350242339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_agnews_padding30model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_agnews_padding30model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_agnews_padding30model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_agnews_padding30model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding30model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding30model_pipeline_en.md new file mode 100644 index 00000000000000..91fe1924aed3d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding30model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_bert_agnews_padding30model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_agnews_padding30model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_agnews_padding30model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding30model_pipeline_en_5.5.0_3.0_1727350264502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding30model_pipeline_en_5.5.0_3.0_1727350264502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
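+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_bert_agnews_padding30model_pipeline", lang = "en") +# Example input; assumes the pipeline's DocumentAssembler reads a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_bert_agnews_padding30model_pipeline", lang = "en") +// Example input; assumes the pipeline's DocumentAssembler reads a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val annotations = pipeline.transform(df) + +``` +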
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_agnews_padding30model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_agnews_padding30model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding70model_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding70model_en.md new file mode 100644 index 00000000000000..52d7fc7ee1c82d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding70model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_agnews_padding70model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_agnews_padding70model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_agnews_padding70model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding70model_en_5.5.0_3.0_1727345635263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding70model_en_5.5.0_3.0_1727345635263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_agnews_padding70model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_agnews_padding70model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_agnews_padding70model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Realgon/N_bert_agnews_padding70model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding70model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding70model_pipeline_en.md new file mode 100644 index 00000000000000..52d113bda5c5d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_agnews_padding70model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_bert_agnews_padding70model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_agnews_padding70model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_agnews_padding70model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding70model_pipeline_en_5.5.0_3.0_1727345657091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_agnews_padding70model_pipeline_en_5.5.0_3.0_1727345657091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
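+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_bert_agnews_padding70model_pipeline", lang = "en") +# Example input; assumes the pipeline's DocumentAssembler reads a "text" column +df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_bert_agnews_padding70model_pipeline", lang = "en") +// Example input; assumes the pipeline's DocumentAssembler reads a "text" column +val df = Seq("I love spark-nlp").toDS.toDF("text") +val annotations = pipeline.transform(df) + +``` +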
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_agnews_padding70model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Realgon/N_bert_agnews_padding70model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding0model_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding0model_en.md new file mode 100644 index 00000000000000..f9fc9f01ee387b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding0model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_imdb_padding0model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_imdb_padding0model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_imdb_padding0model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding0model_en_5.5.0_3.0_1727341396224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding0model_en_5.5.0_3.0_1727341396224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding0model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding0model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_imdb_padding0model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/N_bert_imdb_padding0model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding0model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding0model_pipeline_en.md new file mode 100644 index 00000000000000..ac5165caa35879 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding0model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_bert_imdb_padding0model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_imdb_padding0model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_imdb_padding0model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding0model_pipeline_en_5.5.0_3.0_1727341417418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding0model_pipeline_en_5.5.0_3.0_1727341417418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_bert_imdb_padding0model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_bert_imdb_padding0model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_imdb_padding0model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/N_bert_imdb_padding0model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding10model_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding10model_en.md new file mode 100644 index 00000000000000..77780375af7d1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding10model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_imdb_padding10model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_imdb_padding10model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_imdb_padding10model` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding10model_en_5.5.0_3.0_1727315178326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding10model_en_5.5.0_3.0_1727315178326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding10model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding10model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_imdb_padding10model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Realgon/N_bert_imdb_padding10model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding40model_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding40model_en.md new file mode 100644 index 00000000000000..799cb3a585dd1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding40model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_imdb_padding40model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_imdb_padding40model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_imdb_padding40model` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding40model_en_5.5.0_3.0_1727370050296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding40model_en_5.5.0_3.0_1727370050296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding40model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding40model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_imdb_padding40model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_imdb_padding40model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding50model_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding50model_en.md new file mode 100644 index 00000000000000..52bf7ae3dc2cdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding50model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_imdb_padding50model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_imdb_padding50model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_imdb_padding50model` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding50model_en_5.5.0_3.0_1727336572846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding50model_en_5.5.0_3.0_1727336572846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding50model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_imdb_padding50model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_imdb_padding50model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_imdb_padding50model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding50model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding50model_pipeline_en.md new file mode 100644 index 00000000000000..e2446bd0cac138 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_imdb_padding50model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_bert_imdb_padding50model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_imdb_padding50model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_imdb_padding50model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding50model_pipeline_en_5.5.0_3.0_1727336594301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_imdb_padding50model_pipeline_en_5.5.0_3.0_1727336594301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_bert_imdb_padding50model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_bert_imdb_padding50model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_imdb_padding50model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Realgon/N_bert_imdb_padding50model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding40model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding40model_pipeline_en.md new file mode 100644 index 00000000000000..dcf50288a74979 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding40model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_bert_sst5_padding40model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_sst5_padding40model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_sst5_padding40model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_sst5_padding40model_pipeline_en_5.5.0_3.0_1727313834039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_sst5_padding40model_pipeline_en_5.5.0_3.0_1727313834039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_bert_sst5_padding40model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_bert_sst5_padding40model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_sst5_padding40model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_sst5_padding40model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding50model_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding50model_en.md new file mode 100644 index 00000000000000..d14d32e0b5d703 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding50model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English n_bert_sst5_padding50model BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_sst5_padding50model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_sst5_padding50model` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_sst5_padding50model_en_5.5.0_3.0_1727349077799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_sst5_padding50model_en_5.5.0_3.0_1727349077799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_sst5_padding50model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("n_bert_sst5_padding50model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_sst5_padding50model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Realgon/N_bert_sst5_padding50model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding70model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding70model_pipeline_en.md new file mode 100644 index 00000000000000..c9cc4ceb05a5ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-n_bert_sst5_padding70model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English n_bert_sst5_padding70model_pipeline pipeline BertForSequenceClassification from Realgon +author: John Snow Labs +name: n_bert_sst5_padding70model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n_bert_sst5_padding70model_pipeline` is an English model originally trained by Realgon. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n_bert_sst5_padding70model_pipeline_en_5.5.0_3.0_1727355087200.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n_bert_sst5_padding70model_pipeline_en_5.5.0_3.0_1727355087200.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("n_bert_sst5_padding70model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("n_bert_sst5_padding70model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n_bert_sst5_padding70model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Realgon/N_bert_sst5_padding70model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-named_entity_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-named_entity_model_pipeline_en.md new file mode 100644 index 00000000000000..60b46b0f7b625b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-named_entity_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English named_entity_model_pipeline pipeline BertForSequenceClassification from omniamnaeem +author: John Snow Labs +name: named_entity_model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `named_entity_model_pipeline` is an English model originally trained by omniamnaeem. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/named_entity_model_pipeline_en_5.5.0_3.0_1727328145551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/named_entity_model_pipeline_en_5.5.0_3.0_1727328145551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("named_entity_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("named_entity_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|named_entity_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|43.6 MB| + +## References + +https://huggingface.co/omniamnaeem/Named_Entity_Model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-namedescrbertfinal_en.md b/docs/_posts/ahmedlone127/2024-09-26-namedescrbertfinal_en.md new file mode 100644 index 00000000000000..dd9b108a75c94f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-namedescrbertfinal_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English namedescrbertfinal BertForSequenceClassification from madgnome +author: John Snow Labs +name: namedescrbertfinal +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `namedescrbertfinal` is an English model originally trained by madgnome. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/namedescrbertfinal_en_5.5.0_3.0_1727328802867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/namedescrbertfinal_en_5.5.0_3.0_1727328802867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("namedescrbertfinal","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("namedescrbertfinal", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|namedescrbertfinal| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|669.0 MB| + +## References + +https://huggingface.co/madgnome/namedescrbertfinal \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-nbbert_ed3_en.md b/docs/_posts/ahmedlone127/2024-09-26-nbbert_ed3_en.md new file mode 100644 index 00000000000000..1b22c6431c101f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-nbbert_ed3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English nbbert_ed3 BertForSequenceClassification from yemen2016 +author: John Snow Labs +name: nbbert_ed3 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nbbert_ed3` is an English model originally trained by yemen2016. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nbbert_ed3_en_5.5.0_3.0_1727334045623.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nbbert_ed3_en_5.5.0_3.0_1727334045623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("nbbert_ed3","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nbbert_ed3", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nbbert_ed3| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|668.4 MB| + +## References + +https://huggingface.co/yemen2016/nbbert_ED3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-nlp4012_bert_base_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-nlp4012_bert_base_cased_pipeline_en.md new file mode 100644 index 00000000000000..7ae530a0c76b12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-nlp4012_bert_base_cased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English nlp4012_bert_base_cased_pipeline pipeline BertForSequenceClassification from diyarhamedi +author: John Snow Labs +name: nlp4012_bert_base_cased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp4012_bert_base_cased_pipeline` is an English model originally trained by diyarhamedi. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp4012_bert_base_cased_pipeline_en_5.5.0_3.0_1727340874355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp4012_bert_base_cased_pipeline_en_5.5.0_3.0_1727340874355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("nlp4012_bert_base_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("nlp4012_bert_base_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp4012_bert_base_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/diyarhamedi/nlp4012-bert-base-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-nlp_reviews_en.md b/docs/_posts/ahmedlone127/2024-09-26-nlp_reviews_en.md new file mode 100644 index 00000000000000..b383f7aa68f700 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-nlp_reviews_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English nlp_reviews BertForSequenceClassification from JosephTK +author: John Snow Labs +name: nlp_reviews +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_reviews` is an English model originally trained by JosephTK. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_reviews_en_5.5.0_3.0_1727363279692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_reviews_en_5.5.0_3.0_1727363279692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("nlp_reviews","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nlp_reviews", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_reviews| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JosephTK/NLP-reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-nlpfinalbert0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-nlpfinalbert0_pipeline_en.md new file mode 100644 index 00000000000000..10d0aef78fa9bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-nlpfinalbert0_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English nlpfinalbert0_pipeline pipeline BertForSequenceClassification from zelihami +author: John Snow Labs +name: nlpfinalbert0_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlpfinalbert0_pipeline` is an English model originally trained by zelihami. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlpfinalbert0_pipeline_en_5.5.0_3.0_1727357752860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlpfinalbert0_pipeline_en_5.5.0_3.0_1727357752860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("nlpfinalbert0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("nlpfinalbert0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlpfinalbert0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|691.6 MB| + +## References + +https://huggingface.co/zelihami/nlpfinalbert0 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_base_ctr_regression_en.md b/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_base_ctr_regression_en.md new file mode 100644 index 00000000000000..8c69968d7c7e5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_base_ctr_regression_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English norwegian_bokml_bert_base_ctr_regression BertForSequenceClassification from thusken +author: John Snow Labs +name: norwegian_bokml_bert_base_ctr_regression +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norwegian_bokml_bert_base_ctr_regression` is an English model originally trained by thusken.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norwegian_bokml_bert_base_ctr_regression_en_5.5.0_3.0_1727311977136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norwegian_bokml_bert_base_ctr_regression_en_5.5.0_3.0_1727311977136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("norwegian_bokml_bert_base_ctr_regression","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norwegian_bokml_bert_base_ctr_regression", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norwegian_bokml_bert_base_ctr_regression| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|668.4 MB| + +## References + +https://huggingface.co/thusken/nb-bert-base-ctr-regression \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_finetuned_on_imdb_en.md b/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_finetuned_on_imdb_en.md new file mode 100644 index 00000000000000..6c2a9b1076ffe4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_finetuned_on_imdb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English norwegian_bokml_bert_finetuned_on_imdb BertForSequenceClassification from karolill +author: John Snow Labs +name: norwegian_bokml_bert_finetuned_on_imdb +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norwegian_bokml_bert_finetuned_on_imdb` is an English model originally trained by karolill. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norwegian_bokml_bert_finetuned_on_imdb_en_5.5.0_3.0_1727350332777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norwegian_bokml_bert_finetuned_on_imdb_en_5.5.0_3.0_1727350332777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("norwegian_bokml_bert_finetuned_on_imdb","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norwegian_bokml_bert_finetuned_on_imdb", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norwegian_bokml_bert_finetuned_on_imdb| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/karolill/nb-bert-finetuned-on-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_finetuned_on_imdb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_finetuned_on_imdb_pipeline_en.md new file mode 100644 index 00000000000000..04f080d80d57cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-norwegian_bokml_bert_finetuned_on_imdb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English norwegian_bokml_bert_finetuned_on_imdb_pipeline pipeline BertForSequenceClassification from karolill +author: John Snow Labs +name: norwegian_bokml_bert_finetuned_on_imdb_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norwegian_bokml_bert_finetuned_on_imdb_pipeline` is an English model originally trained by karolill.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norwegian_bokml_bert_finetuned_on_imdb_pipeline_en_5.5.0_3.0_1727350400236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norwegian_bokml_bert_finetuned_on_imdb_pipeline_en_5.5.0_3.0_1727350400236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("norwegian_bokml_bert_finetuned_on_imdb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("norwegian_bokml_bert_finetuned_on_imdb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norwegian_bokml_bert_finetuned_on_imdb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/karolill/nb-bert-finetuned-on-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-nova_threat_analyzer_swe_pipeline_sv.md b/docs/_posts/ahmedlone127/2024-09-26-nova_threat_analyzer_swe_pipeline_sv.md new file mode 100644 index 00000000000000..f1a9edc5313d75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-nova_threat_analyzer_swe_pipeline_sv.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Swedish nova_threat_analyzer_swe_pipeline pipeline BertForSequenceClassification from Arro94 +author: John Snow Labs +name: nova_threat_analyzer_swe_pipeline +date: 2024-09-26 +tags: [sv, open_source, pipeline, onnx] +task: Text Classification +language: sv +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nova_threat_analyzer_swe_pipeline` is a Swedish model originally trained by Arro94. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nova_threat_analyzer_swe_pipeline_sv_5.5.0_3.0_1727321013423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nova_threat_analyzer_swe_pipeline_sv_5.5.0_3.0_1727321013423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("nova_threat_analyzer_swe_pipeline", lang = "sv") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("nova_threat_analyzer_swe_pipeline", lang = "sv") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nova_threat_analyzer_swe_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|sv| +|Size:|467.5 MB| + +## References + +https://huggingface.co/Arro94/nova-threat-analyzer-swe + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-nova_threat_analyzer_swe_sv.md b/docs/_posts/ahmedlone127/2024-09-26-nova_threat_analyzer_swe_sv.md new file mode 100644 index 00000000000000..cbd56e3c1d9213 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-nova_threat_analyzer_swe_sv.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Swedish nova_threat_analyzer_swe BertForSequenceClassification from Arro94 +author: John Snow Labs +name: nova_threat_analyzer_swe +date: 2024-09-26 +tags: [sv, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: sv +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nova_threat_analyzer_swe` is a Swedish model originally trained by Arro94. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nova_threat_analyzer_swe_sv_5.5.0_3.0_1727320986714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nova_threat_analyzer_swe_sv_5.5.0_3.0_1727320986714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("nova_threat_analyzer_swe","sv") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nova_threat_analyzer_swe", "sv") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nova_threat_analyzer_swe| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|sv| +|Size:|467.4 MB| + +## References + +https://huggingface.co/Arro94/nova-threat-analyzer-swe \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ocr8_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-ocr8_bert_base_uncased_en.md new file mode 100644 index 00000000000000..b74d16ac331682 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ocr8_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ocr8_bert_base_uncased BertForSequenceClassification from sebastiencormier +author: John Snow Labs +name: ocr8_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ocr8_bert_base_uncased` is an English model originally trained by sebastiencormier. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ocr8_bert_base_uncased_en_5.5.0_3.0_1727345301245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ocr8_bert_base_uncased_en_5.5.0_3.0_1727345301245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ocr8_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ocr8_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ocr8_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sebastiencormier/ocr8_bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ocr8_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-ocr8_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..1dd89820106d2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ocr8_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ocr8_bert_base_uncased_pipeline pipeline BertForSequenceClassification from sebastiencormier +author: John Snow Labs +name: ocr8_bert_base_uncased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ocr8_bert_base_uncased_pipeline` is an English model originally trained by sebastiencormier. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ocr8_bert_base_uncased_pipeline_en_5.5.0_3.0_1727345322429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ocr8_bert_base_uncased_pipeline_en_5.5.0_3.0_1727345322429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ocr8_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ocr8_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ocr8_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sebastiencormier/ocr8_bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-odia_topic_all_doc_pipeline_or.md b/docs/_posts/ahmedlone127/2024-09-26-odia_topic_all_doc_pipeline_or.md new file mode 100644 index 00000000000000..be5d2ceba30262 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-odia_topic_all_doc_pipeline_or.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Oriya (macrolanguage) odia_topic_all_doc_pipeline pipeline BertForSequenceClassification from l3cube-pune +author: John Snow Labs +name: odia_topic_all_doc_pipeline +date: 2024-09-26 +tags: [or, open_source, pipeline, onnx] +task: Text Classification +language: or +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `odia_topic_all_doc_pipeline` is an Oriya (macrolanguage) model originally trained by l3cube-pune. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/odia_topic_all_doc_pipeline_or_5.5.0_3.0_1727369060155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/odia_topic_all_doc_pipeline_or_5.5.0_3.0_1727369060155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("odia_topic_all_doc_pipeline", lang = "or") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("odia_topic_all_doc_pipeline", lang = "or") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|odia_topic_all_doc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|or| +|Size:|892.7 MB| + +## References + +https://huggingface.co/l3cube-pune/odia-topic-all-doc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-paraphrase_bert_portuguese_en.md b/docs/_posts/ahmedlone127/2024-09-26-paraphrase_bert_portuguese_en.md new file mode 100644 index 00000000000000..e25c0140ccad62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-paraphrase_bert_portuguese_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English paraphrase_bert_portuguese BertForSequenceClassification from erickrribeiro +author: John Snow Labs +name: paraphrase_bert_portuguese +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `paraphrase_bert_portuguese` is an English model originally trained by erickrribeiro. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/paraphrase_bert_portuguese_en_5.5.0_3.0_1727356353188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/paraphrase_bert_portuguese_en_5.5.0_3.0_1727356353188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("paraphrase_bert_portuguese","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("paraphrase_bert_portuguese", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|paraphrase_bert_portuguese| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/erickrribeiro/paraphrase-bert-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-paraphrase_detection_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-paraphrase_detection_bert_pipeline_en.md new file mode 100644 index 00000000000000..de8645ec8a0c91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-paraphrase_detection_bert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English paraphrase_detection_bert_pipeline pipeline BertForSequenceClassification from Isezerano +author: John Snow Labs +name: paraphrase_detection_bert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `paraphrase_detection_bert_pipeline` is an English model originally trained by Isezerano. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/paraphrase_detection_bert_pipeline_en_5.5.0_3.0_1727340764717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/paraphrase_detection_bert_pipeline_en_5.5.0_3.0_1727340764717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("paraphrase_detection_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("paraphrase_detection_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|paraphrase_detection_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Isezerano/paraphrase_detection_bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-parsbert_base_parsinlu_entailment_fa.md b/docs/_posts/ahmedlone127/2024-09-26-parsbert_base_parsinlu_entailment_fa.md new file mode 100644 index 00000000000000..ce0c23d14a0a3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-parsbert_base_parsinlu_entailment_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian parsbert_base_parsinlu_entailment BertForSequenceClassification from persiannlp +author: John Snow Labs +name: parsbert_base_parsinlu_entailment +date: 2024-09-26 +tags: [fa, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parsbert_base_parsinlu_entailment` is a Persian model originally trained by persiannlp. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parsbert_base_parsinlu_entailment_fa_5.5.0_3.0_1727313883928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parsbert_base_parsinlu_entailment_fa_5.5.0_3.0_1727313883928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("parsbert_base_parsinlu_entailment","fa") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("parsbert_base_parsinlu_entailment", "fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parsbert_base_parsinlu_entailment| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/persiannlp/parsbert-base-parsinlu-entailment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-parsbert_base_parsinlu_entailment_pipeline_fa.md b/docs/_posts/ahmedlone127/2024-09-26-parsbert_base_parsinlu_entailment_pipeline_fa.md new file mode 100644 index 00000000000000..bca6118f2d655e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-parsbert_base_parsinlu_entailment_pipeline_fa.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Persian parsbert_base_parsinlu_entailment_pipeline pipeline BertForSequenceClassification from persiannlp +author: John Snow Labs +name: parsbert_base_parsinlu_entailment_pipeline +date: 2024-09-26 +tags: [fa, open_source, pipeline, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parsbert_base_parsinlu_entailment_pipeline` is a Persian model originally trained by persiannlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parsbert_base_parsinlu_entailment_pipeline_fa_5.5.0_3.0_1727313917653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parsbert_base_parsinlu_entailment_pipeline_fa_5.5.0_3.0_1727313917653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("parsbert_base_parsinlu_entailment_pipeline", lang = "fa") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("parsbert_base_parsinlu_entailment_pipeline", lang = "fa") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parsbert_base_parsinlu_entailment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|fa| +|Size:|608.8 MB| + +## References + +https://huggingface.co/persiannlp/parsbert-base-parsinlu-entailment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-parsbert_text_emotion_classification_fa.md b/docs/_posts/ahmedlone127/2024-09-26-parsbert_text_emotion_classification_fa.md new file mode 100644 index 00000000000000..6c58d674f62845 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-parsbert_text_emotion_classification_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian parsbert_text_emotion_classification BertForSequenceClassification from NLPclass +author: John Snow Labs +name: parsbert_text_emotion_classification +date: 2024-09-26 +tags: [fa, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parsbert_text_emotion_classification` is a Persian model originally trained by NLPclass.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parsbert_text_emotion_classification_fa_5.5.0_3.0_1727329585727.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parsbert_text_emotion_classification_fa_5.5.0_3.0_1727329585727.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("parsbert_text_emotion_classification","fa") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("parsbert_text_emotion_classification", "fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parsbert_text_emotion_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/NLPclass/parsBERT_text_emotion_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-parsbert_text_emotion_classification_pipeline_fa.md b/docs/_posts/ahmedlone127/2024-09-26-parsbert_text_emotion_classification_pipeline_fa.md new file mode 100644 index 00000000000000..1f1db101fcd8ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-parsbert_text_emotion_classification_pipeline_fa.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Persian parsbert_text_emotion_classification_pipeline pipeline BertForSequenceClassification from NLPclass +author: John Snow Labs +name: parsbert_text_emotion_classification_pipeline +date: 2024-09-26 +tags: [fa, open_source, pipeline, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parsbert_text_emotion_classification_pipeline` is a Persian model originally trained by NLPclass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parsbert_text_emotion_classification_pipeline_fa_5.5.0_3.0_1727329617093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parsbert_text_emotion_classification_pipeline_fa_5.5.0_3.0_1727329617093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("parsbert_text_emotion_classification_pipeline", lang = "fa") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("parsbert_text_emotion_classification_pipeline", lang = "fa") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parsbert_text_emotion_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/NLPclass/parsBERT_text_emotion_classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-peace_hatebert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-peace_hatebert_pipeline_en.md new file mode 100644 index 00000000000000..3beebf349f1b8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-peace_hatebert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English peace_hatebert_pipeline pipeline BertForSequenceClassification from BenjaminOcampo +author: John Snow Labs +name: peace_hatebert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `peace_hatebert_pipeline` is an English model originally trained by BenjaminOcampo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/peace_hatebert_pipeline_en_5.5.0_3.0_1727341850041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/peace_hatebert_pipeline_en_5.5.0_3.0_1727341850041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("peace_hatebert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("peace_hatebert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|peace_hatebert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.4 MB| + +## References + +https://huggingface.co/BenjaminOcampo/peace_hatebert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_akshay7_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_akshay7_en.md new file mode 100644 index 00000000000000..9cdd77ece6368b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_akshay7_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_akshay7 BertForSequenceClassification from akshay7 +author: John Snow Labs +name: phrasebank_sentiment_analysis_akshay7 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_akshay7` is an English model originally trained by akshay7.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_akshay7_en_5.5.0_3.0_1727311679017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_akshay7_en_5.5.0_3.0_1727311679017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_akshay7","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_akshay7", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_akshay7| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/akshay7/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_fredmulligan_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_fredmulligan_en.md new file mode 100644 index 00000000000000..e6e721bce8b0c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_fredmulligan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_fredmulligan BertForSequenceClassification from fredmulligan +author: John Snow Labs +name: phrasebank_sentiment_analysis_fredmulligan +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_fredmulligan` is an English model originally trained by fredmulligan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_fredmulligan_en_5.5.0_3.0_1727321763116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_fredmulligan_en_5.5.0_3.0_1727321763116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_fredmulligan","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_fredmulligan", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_fredmulligan| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/fredmulligan/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_girijesh_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_girijesh_en.md new file mode 100644 index 00000000000000..9dc504de1ee56f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_girijesh_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_girijesh BertForSequenceClassification from girijesh +author: John Snow Labs +name: phrasebank_sentiment_analysis_girijesh +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_girijesh` is an English model originally trained by girijesh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_girijesh_en_5.5.0_3.0_1727314269654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_girijesh_en_5.5.0_3.0_1727314269654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_girijesh","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_girijesh", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_girijesh| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/girijesh/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_girijesh_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_girijesh_pipeline_en.md new file mode 100644 index 00000000000000..a7c4fe2cdb5196 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_girijesh_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_girijesh_pipeline pipeline BertForSequenceClassification from girijesh +author: John Snow Labs +name: phrasebank_sentiment_analysis_girijesh_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_girijesh_pipeline` is an English model originally trained by girijesh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_girijesh_pipeline_en_5.5.0_3.0_1727314291164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_girijesh_pipeline_en_5.5.0_3.0_1727314291164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phrasebank_sentiment_analysis_girijesh_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phrasebank_sentiment_analysis_girijesh_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_girijesh_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/girijesh/phrasebank-sentiment-analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_priyabrata018_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_priyabrata018_en.md new file mode 100644 index 00000000000000..4f1318a2c89f7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_priyabrata018_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_priyabrata018 BertForSequenceClassification from Priyabrata018 +author: John Snow Labs +name: phrasebank_sentiment_analysis_priyabrata018 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_priyabrata018` is an English model originally trained by Priyabrata018.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_priyabrata018_en_5.5.0_3.0_1727336830550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_priyabrata018_en_5.5.0_3.0_1727336830550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_priyabrata018","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_priyabrata018", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_priyabrata018| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Priyabrata018/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_santis2_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_santis2_en.md new file mode 100644 index 00000000000000..a2011341c97250 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_santis2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_santis2 BertForSequenceClassification from santis2 +author: John Snow Labs +name: phrasebank_sentiment_analysis_santis2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_santis2` is an English model originally trained by santis2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_santis2_en_5.5.0_3.0_1727340871608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_santis2_en_5.5.0_3.0_1727340871608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_santis2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_santis2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_santis2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/santis2/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_scbtm_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_scbtm_en.md new file mode 100644 index 00000000000000..a26c37ffba590e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_scbtm_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_scbtm BertForSequenceClassification from scbtm +author: John Snow Labs +name: phrasebank_sentiment_analysis_scbtm +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_scbtm` is an English model originally trained by scbtm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_scbtm_en_5.5.0_3.0_1727339398640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_scbtm_en_5.5.0_3.0_1727339398640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_scbtm","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_scbtm", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
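The download link earlier in this card appears to pack several pieces of the card's metadata into the archive name. The sketch below is not part of Spark NLP; it assumes the pattern `<name>_<lang>_<spark_nlp_version>_<spark_version>_<timestamp_ms>.zip`, inferred only from the links on this page, and assumes the trailing number is a Unix timestamp in milliseconds. The helper name `parse_archive_url` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed naming pattern, inferred from the links on this page:
#   <name>_<lang>_<spark_nlp_version>_<spark_version>_<timestamp_ms>.zip
ARCHIVE_PATTERN = re.compile(
    r"(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<spark_nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<timestamp_ms>\d+)\.zip$"
)


def parse_archive_url(url):
    """Split a model archive URL into its (assumed) components."""
    filename = url.rsplit("/", 1)[-1]
    match = ARCHIVE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognised archive name: {filename}")
    info = match.groupdict()
    # Assumption: the trailing number is milliseconds since the Unix epoch.
    uploaded = datetime.fromtimestamp(int(info["timestamp_ms"]) // 1000, tz=timezone.utc)
    info["uploaded_utc"] = uploaded.date().isoformat()
    return info


info = parse_archive_url(
    "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/"
    "phrasebank_sentiment_analysis_scbtm_en_5.5.0_3.0_1727339398640.zip"
)
print(info)
```

If the assumed pattern holds, the parsed fields should line up with the `name`, `language`, `edition`, `spark_version` and `date` entries in the card's front matter, which gives a quick consistency check when many cards are generated at once.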
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_scbtm| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/scbtm/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_scbtm_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_scbtm_pipeline_en.md new file mode 100644 index 00000000000000..f0cf4ecf7fd9bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_scbtm_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_scbtm_pipeline pipeline BertForSequenceClassification from scbtm +author: John Snow Labs +name: phrasebank_sentiment_analysis_scbtm_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_scbtm_pipeline` is an English model originally trained by scbtm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_scbtm_pipeline_en_5.5.0_3.0_1727339419749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_scbtm_pipeline_en_5.5.0_3.0_1727339419749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phrasebank_sentiment_analysis_scbtm_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phrasebank_sentiment_analysis_scbtm_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_scbtm_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/scbtm/phrasebank-sentiment-analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_sirenstitches_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_sirenstitches_en.md new file mode 100644 index 00000000000000..c29f9b5ac108ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_sirenstitches_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_sirenstitches BertForSequenceClassification from sirenstitches +author: John Snow Labs +name: phrasebank_sentiment_analysis_sirenstitches +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_sirenstitches` is an English model originally trained by sirenstitches.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_sirenstitches_en_5.5.0_3.0_1727312318909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_sirenstitches_en_5.5.0_3.0_1727312318909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_sirenstitches","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_sirenstitches", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_sirenstitches| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sirenstitches/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_sirenstitches_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_sirenstitches_pipeline_en.md new file mode 100644 index 00000000000000..9ab617e836eb47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_sirenstitches_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_sirenstitches_pipeline pipeline BertForSequenceClassification from sirenstitches +author: John Snow Labs +name: phrasebank_sentiment_analysis_sirenstitches_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_sirenstitches_pipeline` is an English model originally trained by sirenstitches.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_sirenstitches_pipeline_en_5.5.0_3.0_1727312339770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_sirenstitches_pipeline_en_5.5.0_3.0_1727312339770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phrasebank_sentiment_analysis_sirenstitches_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phrasebank_sentiment_analysis_sirenstitches_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_sirenstitches_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sirenstitches/phrasebank-sentiment-analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_snowc2023_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_snowc2023_pipeline_en.md new file mode 100644 index 00000000000000..6d8c41450d7ca6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-phrasebank_sentiment_analysis_snowc2023_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_snowc2023_pipeline pipeline BertForSequenceClassification from snowc2023 +author: John Snow Labs +name: phrasebank_sentiment_analysis_snowc2023_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_snowc2023_pipeline` is an English model originally trained by snowc2023.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_snowc2023_pipeline_en_5.5.0_3.0_1727342622254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_snowc2023_pipeline_en_5.5.0_3.0_1727342622254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("phrasebank_sentiment_analysis_snowc2023_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("phrasebank_sentiment_analysis_snowc2023_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_snowc2023_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/snowc2023/phrasebank-sentiment-analysis + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-protein_lm_gb1_en.md b/docs/_posts/ahmedlone127/2024-09-26-protein_lm_gb1_en.md new file mode 100644 index 00000000000000..c346432df30d05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-protein_lm_gb1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English protein_lm_gb1 BertForSequenceClassification from mcneela +author: John Snow Labs +name: protein_lm_gb1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `protein_lm_gb1` is an English model originally trained by mcneela. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/protein_lm_gb1_en_5.5.0_3.0_1727359721399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/protein_lm_gb1_en_5.5.0_3.0_1727359721399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("protein_lm_gb1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("protein_lm_gb1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|protein_lm_gb1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|864.5 MB| + +## References + +https://huggingface.co/mcneela/protein-lm-gb1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-protein_lm_gb1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-protein_lm_gb1_pipeline_en.md new file mode 100644 index 00000000000000..644f33ff9d670b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-protein_lm_gb1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English protein_lm_gb1_pipeline pipeline BertForSequenceClassification from mcneela +author: John Snow Labs +name: protein_lm_gb1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `protein_lm_gb1_pipeline` is an English model originally trained by mcneela. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/protein_lm_gb1_pipeline_en_5.5.0_3.0_1727359765477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/protein_lm_gb1_pipeline_en_5.5.0_3.0_1727359765477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("protein_lm_gb1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("protein_lm_gb1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|protein_lm_gb1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|864.6 MB| + +## References + +https://huggingface.co/mcneela/protein-lm-gb1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-pruned_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-pruned_model_en.md new file mode 100644 index 00000000000000..bbe73bacb6a1d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-pruned_model_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English pruned_model DistilBertForQuestionAnswering from vxbrandon +author: John Snow Labs +name: pruned_model +date: 2024-09-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pruned_model` is an English model originally trained by vxbrandon. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pruned_model_en_5.5.0_3.0_1727322670497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pruned_model_en_5.5.0_3.0_1727322670497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("pruned_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?", "I use spark-nlp."]]).toDF("question", "context") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("pruned_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pruned_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.2 MB| + +## References + +References + +https://huggingface.co/vxbrandon/pruned_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ptv2_bert_large_uncased_sst2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-ptv2_bert_large_uncased_sst2_pipeline_en.md new file mode 100644 index 00000000000000..4c80ac6a68eb19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ptv2_bert_large_uncased_sst2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ptv2_bert_large_uncased_sst2_pipeline pipeline BertForSequenceClassification from AudreyTrungNguyen +author: John Snow Labs +name: ptv2_bert_large_uncased_sst2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ptv2_bert_large_uncased_sst2_pipeline` is an English model originally trained by AudreyTrungNguyen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ptv2_bert_large_uncased_sst2_pipeline_en_5.5.0_3.0_1727317308297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ptv2_bert_large_uncased_sst2_pipeline_en_5.5.0_3.0_1727317308297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ptv2_bert_large_uncased_sst2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ptv2_bert_large_uncased_sst2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ptv2_bert_large_uncased_sst2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/AudreyTrungNguyen/ptv2-bert-large-uncased-sst2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-pure_python2_en.md b/docs/_posts/ahmedlone127/2024-09-26-pure_python2_en.md new file mode 100644 index 00000000000000..082ac960b363e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-pure_python2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English pure_python2 BertForSequenceClassification from Zahra99 +author: John Snow Labs +name: pure_python2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pure_python2` is an English model originally trained by Zahra99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pure_python2_en_5.5.0_3.0_1727342153189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pure_python2_en_5.5.0_3.0_1727342153189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("pure_python2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pure_python2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pure_python2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Zahra99/pure-python2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-pure_python2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-pure_python2_pipeline_en.md new file mode 100644 index 00000000000000..a9d160e93d7ee5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-pure_python2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English pure_python2_pipeline pipeline BertForSequenceClassification from Zahra99 +author: John Snow Labs +name: pure_python2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pure_python2_pipeline` is an English model originally trained by Zahra99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pure_python2_pipeline_en_5.5.0_3.0_1727342174530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pure_python2_pipeline_en_5.5.0_3.0_1727342174530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("pure_python2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("pure_python2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pure_python2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Zahra99/pure-python2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-push_tonga_tonga_islands_hub_test_2_en.md b/docs/_posts/ahmedlone127/2024-09-26-push_tonga_tonga_islands_hub_test_2_en.md new file mode 100644 index 00000000000000..9b22e4c7d40fd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-push_tonga_tonga_islands_hub_test_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English push_tonga_tonga_islands_hub_test_2 BertForSequenceClassification from sgugger +author: John Snow Labs +name: push_tonga_tonga_islands_hub_test_2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `push_tonga_tonga_islands_hub_test_2` is an English model originally trained by sgugger. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/push_tonga_tonga_islands_hub_test_2_en_5.5.0_3.0_1727339882772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/push_tonga_tonga_islands_hub_test_2_en_5.5.0_3.0_1727339882772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("push_tonga_tonga_islands_hub_test_2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("push_tonga_tonga_islands_hub_test_2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|push_tonga_tonga_islands_hub_test_2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/sgugger/push-to-hub-test-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-push_tonga_tonga_islands_hub_test_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-push_tonga_tonga_islands_hub_test_2_pipeline_en.md new file mode 100644 index 00000000000000..e469d46e611205 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-push_tonga_tonga_islands_hub_test_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English push_tonga_tonga_islands_hub_test_2_pipeline pipeline BertForSequenceClassification from sgugger +author: John Snow Labs +name: push_tonga_tonga_islands_hub_test_2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `push_tonga_tonga_islands_hub_test_2_pipeline` is an English model originally trained by sgugger. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/push_tonga_tonga_islands_hub_test_2_pipeline_en_5.5.0_3.0_1727339904037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/push_tonga_tonga_islands_hub_test_2_pipeline_en_5.5.0_3.0_1727339904037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("push_tonga_tonga_islands_hub_test_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("push_tonga_tonga_islands_hub_test_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|push_tonga_tonga_islands_hub_test_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/sgugger/push-to-hub-test-2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-qd_tweet_convbert_base_turkish_en.md b/docs/_posts/ahmedlone127/2024-09-26-qd_tweet_convbert_base_turkish_en.md new file mode 100644 index 00000000000000..7ed4144d8bb209 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-qd_tweet_convbert_base_turkish_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English qd_tweet_convbert_base_turkish BertForSequenceClassification from Izzet +author: John Snow Labs +name: qd_tweet_convbert_base_turkish +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qd_tweet_convbert_base_turkish` is an English model originally trained by Izzet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qd_tweet_convbert_base_turkish_en_5.5.0_3.0_1727352674084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qd_tweet_convbert_base_turkish_en_5.5.0_3.0_1727352674084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("qd_tweet_convbert_base_turkish","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("qd_tweet_convbert_base_turkish", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qd_tweet_convbert_base_turkish| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|402.3 MB| + +## References + +https://huggingface.co/Izzet/qd_tweet_convbert-base-turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-query_product_relevance_model_ecommerce_en.md b/docs/_posts/ahmedlone127/2024-09-26-query_product_relevance_model_ecommerce_en.md new file mode 100644 index 00000000000000..9b4a49db48f248 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-query_product_relevance_model_ecommerce_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English query_product_relevance_model_ecommerce BertForSequenceClassification from prhegde +author: John Snow Labs +name: query_product_relevance_model_ecommerce +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `query_product_relevance_model_ecommerce` is an English model originally trained by prhegde. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/query_product_relevance_model_ecommerce_en_5.5.0_3.0_1727341229825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/query_product_relevance_model_ecommerce_en_5.5.0_3.0_1727341229825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("query_product_relevance_model_ecommerce","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("query_product_relevance_model_ecommerce", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|query_product_relevance_model_ecommerce| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/prhegde/query-product-relevance-model-ecommerce \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-question_detection_user_utter_en.md b/docs/_posts/ahmedlone127/2024-09-26-question_detection_user_utter_en.md new file mode 100644 index 00000000000000..3a92b453162203 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-question_detection_user_utter_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English question_detection_user_utter BertForSequenceClassification from huaen +author: John Snow Labs +name: question_detection_user_utter +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_detection_user_utter` is an English model originally trained by huaen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_detection_user_utter_en_5.5.0_3.0_1727318959073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_detection_user_utter_en_5.5.0_3.0_1727318959073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("question_detection_user_utter","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("question_detection_user_utter", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_detection_user_utter| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/huaen/question_detection_user_utter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-repurchase_train5_en.md b/docs/_posts/ahmedlone127/2024-09-26-repurchase_train5_en.md new file mode 100644 index 00000000000000..fffb6ab88e6b00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-repurchase_train5_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English repurchase_train5 BertForSequenceClassification from laskovey +author: John Snow Labs +name: repurchase_train5 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `repurchase_train5` is an English model originally trained by laskovey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/repurchase_train5_en_5.5.0_3.0_1727318844466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/repurchase_train5_en_5.5.0_3.0_1727318844466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("repurchase_train5","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("repurchase_train5", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|repurchase_train5| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/laskovey/repurchase_train5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-repurchase_train5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-repurchase_train5_pipeline_en.md new file mode 100644 index 00000000000000..17a9c370230157 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-repurchase_train5_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English repurchase_train5_pipeline pipeline BertForSequenceClassification from laskovey +author: John Snow Labs +name: repurchase_train5_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `repurchase_train5_pipeline` is an English model originally trained by laskovey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/repurchase_train5_pipeline_en_5.5.0_3.0_1727318850414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/repurchase_train5_pipeline_en_5.5.0_3.0_1727318850414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("repurchase_train5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("repurchase_train5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|repurchase_train5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/laskovey/repurchase_train5 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-response_quality_classifier_base_ru.md b/docs/_posts/ahmedlone127/2024-09-26-response_quality_classifier_base_ru.md new file mode 100644 index 00000000000000..a9dcaa845fa3ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-response_quality_classifier_base_ru.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Russian response_quality_classifier_base BertForSequenceClassification from t-bank-ai +author: John Snow Labs +name: response_quality_classifier_base +date: 2024-09-26 +tags: [ru, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: ru +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `response_quality_classifier_base` is a Russian model originally trained by t-bank-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/response_quality_classifier_base_ru_5.5.0_3.0_1727360012409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/response_quality_classifier_base_ru_5.5.0_3.0_1727360012409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("response_quality_classifier_base","ru") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("response_quality_classifier_base", "ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|response_quality_classifier_base| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.6 MB| + +## References + +https://huggingface.co/t-bank-ai/response-quality-classifier-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-reviewusefulness_binaryclassification_de.md b/docs/_posts/ahmedlone127/2024-09-26-reviewusefulness_binaryclassification_de.md new file mode 100644 index 00000000000000..910e2be958697d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-reviewusefulness_binaryclassification_de.md @@ -0,0 +1,94 @@ +--- +layout: model +title: German reviewusefulness_binaryclassification BertForSequenceClassification from jorgeortizv +author: John Snow Labs +name: reviewusefulness_binaryclassification +date: 2024-09-26 +tags: [de, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: de +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reviewusefulness_binaryclassification` is a German model originally trained by jorgeortizv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reviewusefulness_binaryclassification_de_5.5.0_3.0_1727367987382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reviewusefulness_binaryclassification_de_5.5.0_3.0_1727367987382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("reviewusefulness_binaryclassification","de") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("reviewusefulness_binaryclassification", "de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reviewusefulness_binaryclassification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| + +## References + +https://huggingface.co/jorgeortizv/reviewUsefulness-binaryClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-rewardmodel_rajueee_en.md b/docs/_posts/ahmedlone127/2024-09-26-rewardmodel_rajueee_en.md new file mode 100644 index 00000000000000..b23d41b5637993 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-rewardmodel_rajueee_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English rewardmodel_rajueee BertForSequenceClassification from RajuEEE +author: John Snow Labs +name: rewardmodel_rajueee +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rewardmodel_rajueee` is an English model originally trained by RajuEEE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rewardmodel_rajueee_en_5.5.0_3.0_1727343399128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rewardmodel_rajueee_en_5.5.0_3.0_1727343399128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("rewardmodel_rajueee","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rewardmodel_rajueee", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rewardmodel_rajueee| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/RajuEEE/RewardModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-rewardmodel_rajueee_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-rewardmodel_rajueee_pipeline_en.md new file mode 100644 index 00000000000000..b1f8f2681a35b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-rewardmodel_rajueee_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English rewardmodel_rajueee_pipeline pipeline BertForSequenceClassification from RajuEEE +author: John Snow Labs +name: rewardmodel_rajueee_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rewardmodel_rajueee_pipeline` is an English model originally trained by RajuEEE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rewardmodel_rajueee_pipeline_en_5.5.0_3.0_1727343421228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rewardmodel_rajueee_pipeline_en_5.5.0_3.0_1727343421228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("rewardmodel_rajueee_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("rewardmodel_rajueee_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rewardmodel_rajueee_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/RajuEEE/RewardModel + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-rewardmodelpt_pipeline_pt.md b/docs/_posts/ahmedlone127/2024-09-26-rewardmodelpt_pipeline_pt.md new file mode 100644 index 00000000000000..8eba75d414db62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-rewardmodelpt_pipeline_pt.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Portuguese rewardmodelpt_pipeline pipeline BertForSequenceClassification from nicholasKluge +author: John Snow Labs +name: rewardmodelpt_pipeline +date: 2024-09-26 +tags: [pt, open_source, pipeline, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rewardmodelpt_pipeline` is a Portuguese model originally trained by nicholasKluge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rewardmodelpt_pipeline_pt_5.5.0_3.0_1727329426030.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rewardmodelpt_pipeline_pt_5.5.0_3.0_1727329426030.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("rewardmodelpt_pipeline", lang = "pt") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("rewardmodelpt_pipeline", lang = "pt") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rewardmodelpt_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/nicholasKluge/RewardModelPT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-roberta_base_finetuned_lcqmc_chinese_pipeline_zh.md b/docs/_posts/ahmedlone127/2024-09-26-roberta_base_finetuned_lcqmc_chinese_pipeline_zh.md new file mode 100644 index 00000000000000..4ec953b04768a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-roberta_base_finetuned_lcqmc_chinese_pipeline_zh.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Chinese roberta_base_finetuned_lcqmc_chinese_pipeline pipeline BertForSequenceClassification from WangA +author: John Snow Labs +name: roberta_base_finetuned_lcqmc_chinese_pipeline +date: 2024-09-26 +tags: [zh, open_source, pipeline, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_lcqmc_chinese_pipeline` is a Chinese model originally trained by WangA. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_lcqmc_chinese_pipeline_zh_5.5.0_3.0_1727357859133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_lcqmc_chinese_pipeline_zh_5.5.0_3.0_1727357859133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_base_finetuned_lcqmc_chinese_pipeline", lang = "zh") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_base_finetuned_lcqmc_chinese_pipeline", lang = "zh") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_lcqmc_chinese_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/WangA/roberta-base-finetuned-lcqmc-chinese + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-roberta_chinese_specific_pipeline_zh.md b/docs/_posts/ahmedlone127/2024-09-26-roberta_chinese_specific_pipeline_zh.md new file mode 100644 index 00000000000000..d5ca4cc4fab153 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-roberta_chinese_specific_pipeline_zh.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Chinese roberta_chinese_specific_pipeline pipeline BertForSequenceClassification from thu-coai +author: John Snow Labs +name: roberta_chinese_specific_pipeline +date: 2024-09-26 +tags: [zh, open_source, pipeline, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_chinese_specific_pipeline` is a Chinese model originally trained by thu-coai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_chinese_specific_pipeline_zh_5.5.0_3.0_1727334121930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_chinese_specific_pipeline_zh_5.5.0_3.0_1727334121930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
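+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be a Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("roberta_chinese_specific_pipeline", lang = "zh") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be a Spark DataFrame with a "text" column +val pipeline = new PretrainedPipeline("roberta_chinese_specific_pipeline", lang = "zh") +val annotations = pipeline.transform(df) + +``` +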
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_chinese_specific_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/thu-coai/roberta-zh-specific + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-robust_bert_jigsaw_en.md b/docs/_posts/ahmedlone127/2024-09-26-robust_bert_jigsaw_en.md new file mode 100644 index 00000000000000..fb995b00e4ef20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-robust_bert_jigsaw_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English robust_bert_jigsaw BertForSequenceClassification from JiaqiLee +author: John Snow Labs +name: robust_bert_jigsaw +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robust_bert_jigsaw` is an English model originally trained by JiaqiLee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robust_bert_jigsaw_en_5.5.0_3.0_1727309878359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robust_bert_jigsaw_en_5.5.0_3.0_1727309878359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("robust_bert_jigsaw","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // assumes a SparkSession named spark is in scope + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("robust_bert_jigsaw", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
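The archive names behind the Download and Copy S3 URI links in these cards appear to share one pattern: model name, language, Spark NLP version, Spark version, and a trailing number. The sketch below splits a download URL along that pattern and derives the matching S3 URI. It is a minimal illustration, not part of Spark NLP: `ModelArchive` and `parse_model_archive` are hypothetical names, the field order is inferred from the front matter (`edition: Spark NLP 5.5.0`, `spark_version: 3.0`), and reading the trailing number as an epoch timestamp in milliseconds is an assumption.

```python
from datetime import datetime, timezone
from typing import NamedTuple
from urllib.parse import urlparse


class ModelArchive(NamedTuple):
    """Fields inferred from a model archive file name (hypothetical helper type)."""
    name: str
    lang: str
    spark_nlp_version: str
    spark_version: str
    uploaded_at: datetime  # assumption: the trailing number is epoch milliseconds
    s3_uri: str


def parse_model_archive(download_url: str) -> ModelArchive:
    """Split a model download URL into the fields encoded in its file name."""
    # Path looks like /auxdata.johnsnowlabs.com/public/models/<file>.zip
    path = urlparse(download_url).path
    file_name = path.rsplit("/", 1)[-1]
    if not file_name.endswith(".zip"):
        raise ValueError(f"not a .zip archive: {file_name!r}")
    stem = file_name[: -len(".zip")]
    # Model names contain underscores, so peel the four trailing fields off the right.
    name, lang, nlp_version, spark_version, millis = stem.rsplit("_", 4)
    uploaded_at = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
    # The S3 URI in the cards keeps the bucket and key but drops the HTTPS host.
    s3_uri = "s3://" + path.lstrip("/")
    return ModelArchive(name, lang, nlp_version, spark_version, uploaded_at, s3_uri)


archive = parse_model_archive(
    "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/"
    "robust_bert_jigsaw_en_5.5.0_3.0_1727309878359.zip"
)
print(archive.name, archive.lang, archive.spark_nlp_version, archive.spark_version)
print(archive.s3_uri)
```

Splitting from the right keeps the parser independent of how many underscores the model name contains. A file name that does not follow the pattern raises `ValueError` rather than returning misaligned fields.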
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robust_bert_jigsaw| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JiaqiLee/robust-bert-jigsaw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-robust_bert_jigsaw_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-robust_bert_jigsaw_pipeline_en.md new file mode 100644 index 00000000000000..b2e8975618dc7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-robust_bert_jigsaw_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English robust_bert_jigsaw_pipeline pipeline BertForSequenceClassification from JiaqiLee +author: John Snow Labs +name: robust_bert_jigsaw_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robust_bert_jigsaw_pipeline` is an English model originally trained by JiaqiLee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robust_bert_jigsaw_pipeline_en_5.5.0_3.0_1727309900155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robust_bert_jigsaw_pipeline_en_5.5.0_3.0_1727309900155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
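+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be a Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("robust_bert_jigsaw_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be a Spark DataFrame with a "text" column +val pipeline = new PretrainedPipeline("robust_bert_jigsaw_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +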
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robust_bert_jigsaw_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JiaqiLee/robust-bert-jigsaw + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-robust_mbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-robust_mbert_pipeline_en.md new file mode 100644 index 00000000000000..7439fb88075247 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-robust_mbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English robust_mbert_pipeline pipeline BertForSequenceClassification from Anwaarma +author: John Snow Labs +name: robust_mbert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robust_mbert_pipeline` is an English model originally trained by Anwaarma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robust_mbert_pipeline_en_5.5.0_3.0_1727319679027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robust_mbert_pipeline_en_5.5.0_3.0_1727319679027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
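+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be a Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("robust_mbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be a Spark DataFrame with a "text" column +val pipeline = new PretrainedPipeline("robust_mbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +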
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robust_mbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Anwaarma/robust-mbert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ros2_text_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-ros2_text_classification_en.md new file mode 100644 index 00000000000000..c1d0df33dff6b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ros2_text_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ros2_text_classification BertForSequenceClassification from anaaulian19 +author: John Snow Labs +name: ros2_text_classification +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ros2_text_classification` is an English model originally trained by anaaulian19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ros2_text_classification_en_5.5.0_3.0_1727326341028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ros2_text_classification_en_5.5.0_3.0_1727326341028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ros2_text_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // assumes a SparkSession named spark is in scope + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ros2_text_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ros2_text_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.0 MB| + +## References + +https://huggingface.co/anaaulian19/ROS2-text-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ros2_text_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-ros2_text_classification_pipeline_en.md new file mode 100644 index 00000000000000..dbc5d4aab07e16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ros2_text_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ros2_text_classification_pipeline pipeline BertForSequenceClassification from anaaulian19 +author: John Snow Labs +name: ros2_text_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ros2_text_classification_pipeline` is an English model originally trained by anaaulian19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ros2_text_classification_pipeline_en_5.5.0_3.0_1727326363492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ros2_text_classification_pipeline_en_5.5.0_3.0_1727326363492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
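+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be a Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("ros2_text_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be a Spark DataFrame with a "text" column +val pipeline = new PretrainedPipeline("ros2_text_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +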
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ros2_text_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.0 MB| + +## References + +https://huggingface.co/anaaulian19/ROS2-text-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sagemaker_bert_base_arabic_arabic_sas_en.md b/docs/_posts/ahmedlone127/2024-09-26-sagemaker_bert_base_arabic_arabic_sas_en.md new file mode 100644 index 00000000000000..f69b289b8d19b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sagemaker_bert_base_arabic_arabic_sas_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sagemaker_bert_base_arabic_arabic_sas BertForSequenceClassification from Osaleh +author: John Snow Labs +name: sagemaker_bert_base_arabic_arabic_sas +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagemaker_bert_base_arabic_arabic_sas` is an English model originally trained by Osaleh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagemaker_bert_base_arabic_arabic_sas_en_5.5.0_3.0_1727358977968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagemaker_bert_base_arabic_arabic_sas_en_5.5.0_3.0_1727358977968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sagemaker_bert_base_arabic_arabic_sas","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // assumes a SparkSession named spark is in scope + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sagemaker_bert_base_arabic_arabic_sas", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagemaker_bert_base_arabic_arabic_sas| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.2 MB| + +## References + +https://huggingface.co/Osaleh/sagemaker-bert-base-arabic-ar-SAS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sagemaker_bert_base_arabic_arabic_sas_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-sagemaker_bert_base_arabic_arabic_sas_pipeline_en.md new file mode 100644 index 00000000000000..d747d753bda782 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sagemaker_bert_base_arabic_arabic_sas_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sagemaker_bert_base_arabic_arabic_sas_pipeline pipeline BertForSequenceClassification from Osaleh +author: John Snow Labs +name: sagemaker_bert_base_arabic_arabic_sas_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagemaker_bert_base_arabic_arabic_sas_pipeline` is an English model originally trained by Osaleh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagemaker_bert_base_arabic_arabic_sas_pipeline_en_5.5.0_3.0_1727358999355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagemaker_bert_base_arabic_arabic_sas_pipeline_en_5.5.0_3.0_1727358999355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
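+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be a Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("sagemaker_bert_base_arabic_arabic_sas_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be a Spark DataFrame with a "text" column +val pipeline = new PretrainedPipeline("sagemaker_bert_base_arabic_arabic_sas_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +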
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagemaker_bert_base_arabic_arabic_sas_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|414.2 MB| + +## References + +https://huggingface.co/Osaleh/sagemaker-bert-base-arabic-ar-SAS + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sarcasm_detection_bert_base_uncased_newdata_en.md b/docs/_posts/ahmedlone127/2024-09-26-sarcasm_detection_bert_base_uncased_newdata_en.md new file mode 100644 index 00000000000000..dfae9aeee6dff0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sarcasm_detection_bert_base_uncased_newdata_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sarcasm_detection_bert_base_uncased_newdata BertForSequenceClassification from jkhan447 +author: John Snow Labs +name: sarcasm_detection_bert_base_uncased_newdata +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sarcasm_detection_bert_base_uncased_newdata` is an English model originally trained by jkhan447.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sarcasm_detection_bert_base_uncased_newdata_en_5.5.0_3.0_1727334942765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sarcasm_detection_bert_base_uncased_newdata_en_5.5.0_3.0_1727334942765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sarcasm_detection_bert_base_uncased_newdata","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // assumes a SparkSession named spark is in scope + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sarcasm_detection_bert_base_uncased_newdata", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sarcasm_detection_bert_base_uncased_newdata| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jkhan447/sarcasm-detection-Bert-base-uncased-newdata \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qnli_model_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qnli_model_bert_base_uncased_en.md new file mode 100644 index 00000000000000..d6fcb1e6098fd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qnli_model_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English scenario_tcr_data_glue_qnli_model_bert_base_uncased BertForSequenceClassification from haryoaw +author: John Snow Labs +name: scenario_tcr_data_glue_qnli_model_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scenario_tcr_data_glue_qnli_model_bert_base_uncased` is an English model originally trained by haryoaw.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_glue_qnli_model_bert_base_uncased_en_5.5.0_3.0_1727312231768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_glue_qnli_model_bert_base_uncased_en_5.5.0_3.0_1727312231768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("scenario_tcr_data_glue_qnli_model_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // assumes a SparkSession named spark is in scope + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("scenario_tcr_data_glue_qnli_model_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_tcr_data_glue_qnli_model_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/haryoaw/scenario-TCR-data-glue-qnli-model-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..ce514aafecf05b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline pipeline BertForSequenceClassification from haryoaw +author: John Snow Labs +name: scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline` is an English model originally trained by haryoaw.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline_en_5.5.0_3.0_1727312253103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline_en_5.5.0_3.0_1727312253103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
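+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.pretrained import PretrainedPipeline + +# df is assumed to be a Spark DataFrame with a "text" column +pipeline = PretrainedPipeline("scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline + +// df is assumed to be a Spark DataFrame with a "text" column +val pipeline = new PretrainedPipeline("scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +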
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_tcr_data_glue_qnli_model_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/haryoaw/scenario-TCR-data-glue-qnli-model-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qqp_model_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qqp_model_bert_base_uncased_en.md new file mode 100644 index 00000000000000..0fdecb4ed7d60d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-scenario_tcr_data_glue_qqp_model_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English scenario_tcr_data_glue_qqp_model_bert_base_uncased BertForSequenceClassification from haryoaw +author: John Snow Labs +name: scenario_tcr_data_glue_qqp_model_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scenario_tcr_data_glue_qqp_model_bert_base_uncased` is an English model originally trained by haryoaw.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_glue_qqp_model_bert_base_uncased_en_5.5.0_3.0_1727315529238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_glue_qqp_model_bert_base_uncased_en_5.5.0_3.0_1727315529238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("scenario_tcr_data_glue_qqp_model_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // assumes a SparkSession named spark is in scope + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("scenario_tcr_data_glue_qqp_model_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_tcr_data_glue_qqp_model_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/haryoaw/scenario-TCR-data-glue-qqp-model-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_qnli_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_qnli_en.md new file mode 100644 index 00000000000000..1ecd31add21cdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_qnli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sead_l_6_h_256_a_8_qnli BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_256_a_8_qnli +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_256_a_8_qnli` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_qnli_en_5.5.0_3.0_1727363833713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_qnli_en_5.5.0_3.0_1727363833713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_256_a_8_qnli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_256_a_8_qnli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_256_a_8_qnli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|47.4 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-256_A-8-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_qnli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_qnli_pipeline_en.md new file mode 100644 index 00000000000000..7c6b64c2a5b890 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_qnli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sead_l_6_h_256_a_8_qnli_pipeline pipeline BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_256_a_8_qnli_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_256_a_8_qnli_pipeline` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_qnli_pipeline_en_5.5.0_3.0_1727363836608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_qnli_pipeline_en_5.5.0_3.0_1727363836608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sead_l_6_h_256_a_8_qnli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sead_l_6_h_256_a_8_qnli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_256_a_8_qnli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|47.4 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-256_A-8-qnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_sst2_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_sst2_en.md new file mode 100644 index 00000000000000..34f613e0f5f973 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_sst2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sead_l_6_h_256_a_8_sst2 BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_256_a_8_sst2 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_256_a_8_sst2` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_sst2_en_5.5.0_3.0_1727359116485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_sst2_en_5.5.0_3.0_1727359116485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_256_a_8_sst2","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_256_a_8_sst2", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_256_a_8_sst2| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|47.3 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-256_A-8-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_wnli_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_wnli_en.md new file mode 100644 index 00000000000000..af0c9b4fffa6d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_wnli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sead_l_6_h_256_a_8_wnli BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_256_a_8_wnli +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_256_a_8_wnli` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_wnli_en_5.5.0_3.0_1727359802702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_wnli_en_5.5.0_3.0_1727359802702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_256_a_8_wnli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_256_a_8_wnli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_256_a_8_wnli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|47.3 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-256_A-8-wnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_wnli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_wnli_pipeline_en.md new file mode 100644 index 00000000000000..02c1ba84a0be93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_256_a_8_wnli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sead_l_6_h_256_a_8_wnli_pipeline pipeline BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_256_a_8_wnli_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_256_a_8_wnli_pipeline` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_wnli_pipeline_en_5.5.0_3.0_1727359805590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_256_a_8_wnli_pipeline_en_5.5.0_3.0_1727359805590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sead_l_6_h_256_a_8_wnli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sead_l_6_h_256_a_8_wnli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_256_a_8_wnli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|47.3 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-256_A-8-wnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_qqp_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_qqp_pipeline_en.md new file mode 100644 index 00000000000000..8041c8eb987353 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_qqp_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sead_l_6_h_384_a_12_qqp_pipeline pipeline BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_384_a_12_qqp_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_384_a_12_qqp_pipeline` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_qqp_pipeline_en_5.5.0_3.0_1727329251582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_qqp_pipeline_en_5.5.0_3.0_1727329251582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sead_l_6_h_384_a_12_qqp_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sead_l_6_h_384_a_12_qqp_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_384_a_12_qqp_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|84.4 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-384_A-12-qqp + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_wnli_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_wnli_en.md new file mode 100644 index 00000000000000..1a708541e6596d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_wnli_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sead_l_6_h_384_a_12_wnli BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_384_a_12_wnli +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_384_a_12_wnli` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_wnli_en_5.5.0_3.0_1727330835871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_wnli_en_5.5.0_3.0_1727330835871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_384_a_12_wnli","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sead_l_6_h_384_a_12_wnli", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_384_a_12_wnli| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.2 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-384_A-12-wnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_wnli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_wnli_pipeline_en.md new file mode 100644 index 00000000000000..184d11c9a32f96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sead_l_6_h_384_a_12_wnli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sead_l_6_h_384_a_12_wnli_pipeline pipeline BertForSequenceClassification from C5i +author: John Snow Labs +name: sead_l_6_h_384_a_12_wnli_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sead_l_6_h_384_a_12_wnli_pipeline` is an English model originally trained by C5i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_wnli_pipeline_en_5.5.0_3.0_1727330840519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sead_l_6_h_384_a_12_wnli_pipeline_en_5.5.0_3.0_1727330840519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sead_l_6_h_384_a_12_wnli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sead_l_6_h_384_a_12_wnli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sead_l_6_h_384_a_12_wnli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|84.2 MB| + +## References + +https://huggingface.co/C5i/SEAD-L-6_H-384_A-12-wnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sector_multilabel_bge_f_en.md b/docs/_posts/ahmedlone127/2024-09-26-sector_multilabel_bge_f_en.md new file mode 100644 index 00000000000000..0249b8d8d6552a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sector_multilabel_bge_f_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sector_multilabel_bge_f BertForSequenceClassification from GIZ +author: John Snow Labs +name: sector_multilabel_bge_f +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sector_multilabel_bge_f` is an English model originally trained by GIZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sector_multilabel_bge_f_en_5.5.0_3.0_1727344068016.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sector_multilabel_bge_f_en_5.5.0_3.0_1727344068016.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sector_multilabel_bge_f","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sector_multilabel_bge_f", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sector_multilabel_bge_f| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|396.5 MB| + +## References + +https://huggingface.co/GIZ/SECTOR-multilabel-bge_f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sentibertlarge_bn.md b/docs/_posts/ahmedlone127/2024-09-26-sentibertlarge_bn.md new file mode 100644 index 00000000000000..2d746dbff7eb01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sentibertlarge_bn.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Bengali sentibertlarge BertForSequenceClassification from ahnaf702 +author: John Snow Labs +name: sentibertlarge +date: 2024-09-26 +tags: [bn, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: bn +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentibertlarge` is a Bengali model originally trained by ahnaf702. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentibertlarge_bn_5.5.0_3.0_1727328148622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentibertlarge_bn_5.5.0_3.0_1727328148622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sentibertlarge","bn") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentibertlarge", "bn") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentibertlarge| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|bn| +|Size:|414.4 MB| + +## References + +https://huggingface.co/ahnaf702/SentibertLarge \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sentibertlarge_pipeline_bn.md b/docs/_posts/ahmedlone127/2024-09-26-sentibertlarge_pipeline_bn.md new file mode 100644 index 00000000000000..67d4cb75fcabc6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sentibertlarge_pipeline_bn.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Bengali sentibertlarge_pipeline pipeline BertForSequenceClassification from ahnaf702 +author: John Snow Labs +name: sentibertlarge_pipeline +date: 2024-09-26 +tags: [bn, open_source, pipeline, onnx] +task: Text Classification +language: bn +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentibertlarge_pipeline` is a Bengali model originally trained by ahnaf702. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentibertlarge_pipeline_bn_5.5.0_3.0_1727328169972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentibertlarge_pipeline_bn_5.5.0_3.0_1727328169972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sentibertlarge_pipeline", lang = "bn") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sentibertlarge_pipeline", lang = "bn") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentibertlarge_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|bn| +|Size:|414.4 MB| + +## References + +https://huggingface.co/ahnaf702/SentibertLarge + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sentimen_analysis_yelp_en.md b/docs/_posts/ahmedlone127/2024-09-26-sentimen_analysis_yelp_en.md new file mode 100644 index 00000000000000..a86d4548f47978 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sentimen_analysis_yelp_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sentimen_analysis_yelp BertForSequenceClassification from Imran1 +author: John Snow Labs +name: sentimen_analysis_yelp +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentimen_analysis_yelp` is an English model originally trained by Imran1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentimen_analysis_yelp_en_5.5.0_3.0_1727343158388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentimen_analysis_yelp_en_5.5.0_3.0_1727343158388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sentimen_analysis_yelp","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentimen_analysis_yelp", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentimen_analysis_yelp| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Imran1/sentimen_analysis_yelp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sentiment_analysis_indobertweet_en.md b/docs/_posts/ahmedlone127/2024-09-26-sentiment_analysis_indobertweet_en.md new file mode 100644 index 00000000000000..618b3ce2e8ccc6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sentiment_analysis_indobertweet_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sentiment_analysis_indobertweet BertForSequenceClassification from ridhodaffasyah +author: John Snow Labs +name: sentiment_analysis_indobertweet +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_indobertweet` is an English model originally trained by ridhodaffasyah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_indobertweet_en_5.5.0_3.0_1727367006359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_indobertweet_en_5.5.0_3.0_1727367006359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_indobertweet","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_indobertweet", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_indobertweet| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.0 MB| + +## References + +https://huggingface.co/ridhodaffasyah/sentiment-analysis-indobertweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sentiment_analysis_task_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-sentiment_analysis_task_1_pipeline_en.md new file mode 100644 index 00000000000000..19b5e564d0378d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sentiment_analysis_task_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sentiment_analysis_task_1_pipeline pipeline BertForSequenceClassification from ABHISHEKMONU2001 +author: John Snow Labs +name: sentiment_analysis_task_1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_task_1_pipeline` is an English model originally trained by ABHISHEKMONU2001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_task_1_pipeline_en_5.5.0_3.0_1727317794389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_task_1_pipeline_en_5.5.0_3.0_1727317794389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sentiment_analysis_task_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sentiment_analysis_task_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_task_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ABHISHEKMONU2001/Sentiment_Analysis_Task_1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sentiment_model_en.md b/docs/_posts/ahmedlone127/2024-09-26-sentiment_model_en.md new file mode 100644 index 00000000000000..790e04b0b1f813 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sentiment_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sentiment_model BertForSequenceClassification from zanafi +author: John Snow Labs +name: sentiment_model +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_model` is an English model originally trained by zanafi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_model_en_5.5.0_3.0_1727313936118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_model_en_5.5.0_3.0_1727313936118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_model","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_model", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_model| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|413.9 MB| + +## References + +https://huggingface.co/zanafi/sentiment_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-show_image_en.md b/docs/_posts/ahmedlone127/2024-09-26-show_image_en.md new file mode 100644 index 00000000000000..298d35a82dfb66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-show_image_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English show_image BertForSequenceClassification from thanhduycao +author: John Snow Labs +name: show_image +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `show_image` is an English model originally trained by thanhduycao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/show_image_en_5.5.0_3.0_1727309111971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/show_image_en_5.5.0_3.0_1727309111971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("show_image","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("show_image", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|show_image| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|75.1 MB| + +## References + +https://huggingface.co/thanhduycao/show_image \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-show_image_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-show_image_pipeline_en.md new file mode 100644 index 00000000000000..ebf0c7fe7ef102 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-show_image_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English show_image_pipeline pipeline BertForSequenceClassification from thanhduycao +author: John Snow Labs +name: show_image_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `show_image_pipeline` is an English model originally trained by thanhduycao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/show_image_pipeline_en_5.5.0_3.0_1727309119931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/show_image_pipeline_en_5.5.0_3.0_1727309119931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("show_image_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("show_image_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|show_image_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|75.1 MB| + +## References + +https://huggingface.co/thanhduycao/show_image + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-simple_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-simple_classification_en.md new file mode 100644 index 00000000000000..2ddb4d5c146495 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-simple_classification_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English simple_classification DistilBertForSequenceClassification from ai-ar +author: John Snow Labs +name: simple_classification +date: 2024-09-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `simple_classification` is an English model originally trained by ai-ar. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/simple_classification_en_5.5.0_3.0_1727342440690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/simple_classification_en_5.5.0_3.0_1727342440690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("simple_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("simple_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|simple_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +References + +https://huggingface.co/ai-ar/simple-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sinhala_albert_en.md b/docs/_posts/ahmedlone127/2024-09-26-sinhala_albert_en.md new file mode 100644 index 00000000000000..2286efed7893ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sinhala_albert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sinhala_albert BertForSequenceClassification from theekshana +author: John Snow Labs +name: sinhala_albert +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sinhala_albert` is an English model originally trained by theekshana. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sinhala_albert_en_5.5.0_3.0_1727330465188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sinhala_albert_en_5.5.0_3.0_1727330465188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("sinhala_albert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sinhala_albert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sinhala_albert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|44.2 MB| + +## References + +https://huggingface.co/theekshana/sinhala_albert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-sinhala_albert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-sinhala_albert_pipeline_en.md new file mode 100644 index 00000000000000..e7f3dd2af8f314 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-sinhala_albert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sinhala_albert_pipeline pipeline BertForSequenceClassification from theekshana +author: John Snow Labs +name: sinhala_albert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sinhala_albert_pipeline` is an English model originally trained by theekshana. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sinhala_albert_pipeline_en_5.5.0_3.0_1727330467617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sinhala_albert_pipeline_en_5.5.0_3.0_1727330467617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sinhala_albert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sinhala_albert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sinhala_albert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|44.2 MB| + +## References + +https://huggingface.co/theekshana/sinhala_albert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-snli_w_premise_100k_en.md b/docs/_posts/ahmedlone127/2024-09-26-snli_w_premise_100k_en.md new file mode 100644 index 00000000000000..eb6d2b3fb43cbd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-snli_w_premise_100k_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English snli_w_premise_100k BertForSequenceClassification from grace-pro +author: John Snow Labs +name: snli_w_premise_100k +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `snli_w_premise_100k` is an English model originally trained by grace-pro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/snli_w_premise_100k_en_5.5.0_3.0_1727340177691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/snli_w_premise_100k_en_5.5.0_3.0_1727340177691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("snli_w_premise_100k","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("snli_w_premise_100k", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|snli_w_premise_100k| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/grace-pro/snli_w_premise_100k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ssc_bert_en.md b/docs/_posts/ahmedlone127/2024-09-26-ssc_bert_en.md new file mode 100644 index 00000000000000..c1891cdeecedbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ssc_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ssc_bert BertForSequenceClassification from rasoultilburg +author: John Snow Labs +name: ssc_bert +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ssc_bert` is an English model originally trained by rasoultilburg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ssc_bert_en_5.5.0_3.0_1727319763318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ssc_bert_en_5.5.0_3.0_1727319763318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ssc_bert","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ssc_bert", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ssc_bert| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rasoultilburg/ssc_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ssc_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-ssc_bert_pipeline_en.md new file mode 100644 index 00000000000000..962cbc581d7b8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ssc_bert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ssc_bert_pipeline pipeline BertForSequenceClassification from rasoultilburg +author: John Snow Labs +name: ssc_bert_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ssc_bert_pipeline` is an English model originally trained by rasoultilburg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ssc_bert_pipeline_en_5.5.0_3.0_1727319784067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ssc_bert_pipeline_en_5.5.0_3.0_1727319784067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ssc_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ssc_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ssc_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rasoultilburg/ssc_bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ssml_bert_tunedmodel_en.md b/docs/_posts/ahmedlone127/2024-09-26-ssml_bert_tunedmodel_en.md new file mode 100644 index 00000000000000..0b4eaabcbe9278 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ssml_bert_tunedmodel_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ssml_bert_tunedmodel BertForSequenceClassification from ssml2050 +author: John Snow Labs +name: ssml_bert_tunedmodel +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ssml_bert_tunedmodel` is an English model originally trained by ssml2050. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ssml_bert_tunedmodel_en_5.5.0_3.0_1727342867204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ssml_bert_tunedmodel_en_5.5.0_3.0_1727342867204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("ssml_bert_tunedmodel","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ssml_bert_tunedmodel", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ssml_bert_tunedmodel| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ssml2050/ssml_bert_tunedmodel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-ssml_bert_tunedmodel_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-ssml_bert_tunedmodel_pipeline_en.md new file mode 100644 index 00000000000000..d07e466199dc68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-ssml_bert_tunedmodel_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ssml_bert_tunedmodel_pipeline pipeline BertForSequenceClassification from ssml2050 +author: John Snow Labs +name: ssml_bert_tunedmodel_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ssml_bert_tunedmodel_pipeline` is an English model originally trained by ssml2050. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ssml_bert_tunedmodel_pipeline_en_5.5.0_3.0_1727342888467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ssml_bert_tunedmodel_pipeline_en_5.5.0_3.0_1727342888467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ssml_bert_tunedmodel_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ssml_bert_tunedmodel_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ssml_bert_tunedmodel_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ssml2050/ssml_bert_tunedmodel + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-statement_equivalence_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-statement_equivalence_pipeline_en.md new file mode 100644 index 00000000000000..44dd2026c5f365 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-statement_equivalence_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English statement_equivalence_pipeline pipeline BertForSequenceClassification from MattStammers +author: John Snow Labs +name: statement_equivalence_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `statement_equivalence_pipeline` is an English model originally trained by MattStammers. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/statement_equivalence_pipeline_en_5.5.0_3.0_1727342271526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/statement_equivalence_pipeline_en_5.5.0_3.0_1727342271526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("statement_equivalence_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("statement_equivalence_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|statement_equivalence_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MattStammers/Statement_Equivalence + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stock_market_news_classification_en.md b/docs/_posts/ahmedlone127/2024-09-26-stock_market_news_classification_en.md new file mode 100644 index 00000000000000..1659c39a06f076 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stock_market_news_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English stock_market_news_classification BertForSequenceClassification from XA7 +author: John Snow Labs +name: stock_market_news_classification +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stock_market_news_classification` is an English model originally trained by XA7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stock_market_news_classification_en_5.5.0_3.0_1727354433392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stock_market_news_classification_en_5.5.0_3.0_1727354433392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("stock_market_news_classification","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("stock_market_news_classification", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stock_market_news_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/XA7/Stock-market-news-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stock_market_news_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-stock_market_news_classification_pipeline_en.md new file mode 100644 index 00000000000000..3c499e771c5724 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stock_market_news_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English stock_market_news_classification_pipeline pipeline BertForSequenceClassification from XA7 +author: John Snow Labs +name: stock_market_news_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stock_market_news_classification_pipeline` is an English model originally trained by XA7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stock_market_news_classification_pipeline_en_5.5.0_3.0_1727354454697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stock_market_news_classification_pipeline_en_5.5.0_3.0_1727354454697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("stock_market_news_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("stock_market_news_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stock_market_news_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/XA7/Stock-market-news-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_5_001_en.md b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_5_001_en.md new file mode 100644 index 00000000000000..c92fe342b298f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_5_001_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English stsb_tinybert_l_4_finetuned_auc_151221_5_001 BertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: stsb_tinybert_l_4_finetuned_auc_151221_5_001 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_tinybert_l_4_finetuned_auc_151221_5_001` is an English model originally trained by Katsiaryna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_5_001_en_5.5.0_3.0_1727318212149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_5_001_en_5.5.0_3.0_1727318212149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4_finetuned_auc_151221_5_001","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4_finetuned_auc_151221_5_001", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_tinybert_l_4_finetuned_auc_151221_5_001| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/Katsiaryna/stsb-TinyBERT-L-4-finetuned_auc_151221-5-001 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline_en.md new file mode 100644 index 00000000000000..9d8dbacd10ad59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline pipeline BertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline` is an English model originally trained by Katsiaryna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline_en_5.5.0_3.0_1727318214939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline_en_5.5.0_3.0_1727318214939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_tinybert_l_4_finetuned_auc_151221_5_001_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/Katsiaryna/stsb-TinyBERT-L-4-finetuned_auc_151221-5-001 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_top1_en.md b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_top1_en.md new file mode 100644 index 00000000000000..35e4d5015dcc11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_top1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English stsb_tinybert_l_4_finetuned_auc_151221_top1 BertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: stsb_tinybert_l_4_finetuned_auc_151221_top1 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_tinybert_l_4_finetuned_auc_151221_top1` is an English model originally trained by Katsiaryna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_top1_en_5.5.0_3.0_1727316360921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_top1_en_5.5.0_3.0_1727316360921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4_finetuned_auc_151221_top1","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4_finetuned_auc_151221_top1", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_tinybert_l_4_finetuned_auc_151221_top1| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/Katsiaryna/stsb-TinyBERT-L-4-finetuned_auc_151221-top1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline_en.md new file mode 100644 index 00000000000000..8bc6f7b1334e1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline pipeline BertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline` is an English model originally trained by Katsiaryna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline_en_5.5.0_3.0_1727316363842.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline_en_5.5.0_3.0_1727316363842.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_tinybert_l_4_finetuned_auc_151221_top1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/Katsiaryna/stsb-TinyBERT-L-4-finetuned_auc_151221-top1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_en.md b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_en.md new file mode 100644 index 00000000000000..fb0b8d8a15669e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stsb_tinybert_l_4_finetuned_auc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English stsb_tinybert_l_4_finetuned_auc BertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: stsb_tinybert_l_4_finetuned_auc +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_tinybert_l_4_finetuned_auc` is an English model originally trained by Katsiaryna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_en_5.5.0_3.0_1727316484241.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_finetuned_auc_en_5.5.0_3.0_1727316484241.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4_finetuned_auc","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4_finetuned_auc", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_tinybert_l_4_finetuned_auc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/Katsiaryna/stsb-TinyBERT-L-4-finetuned_auc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-stud_fac_eval_bert_base_uncased_pipeline_tl.md b/docs/_posts/ahmedlone127/2024-09-26-stud_fac_eval_bert_base_uncased_pipeline_tl.md new file mode 100644 index 00000000000000..f1c492cf861b96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-stud_fac_eval_bert_base_uncased_pipeline_tl.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Tagalog stud_fac_eval_bert_base_uncased_pipeline pipeline BertForSequenceClassification from MENG21 +author: John Snow Labs +name: stud_fac_eval_bert_base_uncased_pipeline +date: 2024-09-26 +tags: [tl, open_source, pipeline, onnx] +task: Text Classification +language: tl +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stud_fac_eval_bert_base_uncased_pipeline` is a Tagalog model originally trained by MENG21. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stud_fac_eval_bert_base_uncased_pipeline_tl_5.5.0_3.0_1727344298744.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stud_fac_eval_bert_base_uncased_pipeline_tl_5.5.0_3.0_1727344298744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("stud_fac_eval_bert_base_uncased_pipeline", lang = "tl") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("stud_fac_eval_bert_base_uncased_pipeline", lang = "tl") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stud_fac_eval_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|tl| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MENG21/stud-fac-eval-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-task_implicit_task__model_hatebert__aug_method_all_en.md b/docs/_posts/ahmedlone127/2024-09-26-task_implicit_task__model_hatebert__aug_method_all_en.md new file mode 100644 index 00000000000000..758f56e864b9d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-task_implicit_task__model_hatebert__aug_method_all_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English task_implicit_task__model_hatebert__aug_method_all BertForSequenceClassification from BenjaminOcampo +author: John Snow Labs +name: task_implicit_task__model_hatebert__aug_method_all +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `task_implicit_task__model_hatebert__aug_method_all` is an English model originally trained by BenjaminOcampo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/task_implicit_task__model_hatebert__aug_method_all_en_5.5.0_3.0_1727344799013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/task_implicit_task__model_hatebert__aug_method_all_en_5.5.0_3.0_1727344799013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("task_implicit_task__model_hatebert__aug_method_all","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("task_implicit_task__model_hatebert__aug_method_all", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|task_implicit_task__model_hatebert__aug_method_all| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/BenjaminOcampo/task-implicit_task__model-hatebert__aug_method-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tesla_news_title_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2024-09-26-tesla_news_title_sentiment_analysis_en.md new file mode 100644 index 00000000000000..b085a91dd59ac4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tesla_news_title_sentiment_analysis_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tesla_news_title_sentiment_analysis BertForSequenceClassification from YC9Z +author: John Snow Labs +name: tesla_news_title_sentiment_analysis +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tesla_news_title_sentiment_analysis` is an English model originally trained by YC9Z. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tesla_news_title_sentiment_analysis_en_5.5.0_3.0_1727316648901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tesla_news_title_sentiment_analysis_en_5.5.0_3.0_1727316648901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tesla_news_title_sentiment_analysis","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tesla_news_title_sentiment_analysis", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tesla_news_title_sentiment_analysis| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/YC9Z/tesla_news_title_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-test_bert_base_banking77_en.md b/docs/_posts/ahmedlone127/2024-09-26-test_bert_base_banking77_en.md new file mode 100644 index 00000000000000..e6c7a8f039c9ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-test_bert_base_banking77_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English test_bert_base_banking77 BertForSequenceClassification from Kirie +author: John Snow Labs +name: test_bert_base_banking77 +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_bert_base_banking77` is an English model originally trained by Kirie. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_bert_base_banking77_en_5.5.0_3.0_1727355029595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_bert_base_banking77_en_5.5.0_3.0_1727355029595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("test_bert_base_banking77","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("test_bert_base_banking77", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_bert_base_banking77| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Kirie/test-bert-base-banking77 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-test_bert_base_banking77_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-test_bert_base_banking77_pipeline_en.md new file mode 100644 index 00000000000000..76d1893769295f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-test_bert_base_banking77_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English test_bert_base_banking77_pipeline pipeline BertForSequenceClassification from Kirie +author: John Snow Labs +name: test_bert_base_banking77_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_bert_base_banking77_pipeline` is an English model originally trained by Kirie. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_bert_base_banking77_pipeline_en_5.5.0_3.0_1727355052115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_bert_base_banking77_pipeline_en_5.5.0_3.0_1727355052115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("test_bert_base_banking77_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("test_bert_base_banking77_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_bert_base_banking77_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Kirie/test-bert-base-banking77 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-test_trainer_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-test_trainer_2_pipeline_en.md new file mode 100644 index 00000000000000..4113ac63771d5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-test_trainer_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English test_trainer_2_pipeline pipeline BertForSequenceClassification from AleRams +author: John Snow Labs +name: test_trainer_2_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_2_pipeline` is an English model originally trained by AleRams. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_2_pipeline_en_5.5.0_3.0_1727319754191.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_2_pipeline_en_5.5.0_3.0_1727319754191.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("test_trainer_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("test_trainer_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AleRams/test-trainer_2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-test_trainer_dongxiaoxia194_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-test_trainer_dongxiaoxia194_pipeline_en.md new file mode 100644 index 00000000000000..e9bffe3f945ce9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-test_trainer_dongxiaoxia194_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English test_trainer_dongxiaoxia194_pipeline pipeline BertForSequenceClassification from dongxiaoxia194 +author: John Snow Labs +name: test_trainer_dongxiaoxia194_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_dongxiaoxia194_pipeline` is an English model originally trained by dongxiaoxia194. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_dongxiaoxia194_pipeline_en_5.5.0_3.0_1727353402765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_dongxiaoxia194_pipeline_en_5.5.0_3.0_1727353402765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("test_trainer_dongxiaoxia194_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("test_trainer_dongxiaoxia194_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_dongxiaoxia194_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/dongxiaoxia194/test-trainer + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-test_trainer_xyz123xyz_en.md b/docs/_posts/ahmedlone127/2024-09-26-test_trainer_xyz123xyz_en.md new file mode 100644 index 00000000000000..2a9fc5f6e567e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-test_trainer_xyz123xyz_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English test_trainer_xyz123xyz BertForSequenceClassification from XYZ123XYZ +author: John Snow Labs +name: test_trainer_xyz123xyz +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_xyz123xyz` is an English model originally trained by XYZ123XYZ. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_xyz123xyz_en_5.5.0_3.0_1727353523243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_xyz123xyz_en_5.5.0_3.0_1727353523243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_xyz123xyz","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_xyz123xyz", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_xyz123xyz| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/XYZ123XYZ/test-trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-text_classification_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-text_classification_bert_base_uncased_en.md new file mode 100644 index 00000000000000..486fd1d765f760 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-text_classification_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English text_classification_bert_base_uncased BertForSequenceClassification from Cynthiaiii4 +author: John Snow Labs +name: text_classification_bert_base_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classification_bert_base_uncased` is an English model originally trained by Cynthiaiii4. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classification_bert_base_uncased_en_5.5.0_3.0_1727351679995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classification_bert_base_uncased_en_5.5.0_3.0_1727351679995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("text_classification_bert_base_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("text_classification_bert_base_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classification_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Cynthiaiii4/Text_classification_bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-text_classify_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-text_classify_model_pipeline_en.md new file mode 100644 index 00000000000000..6ae24ebc96d3a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-text_classify_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English text_classify_model_pipeline pipeline BertForSequenceClassification from hydrochii +author: John Snow Labs +name: text_classify_model_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classify_model_pipeline` is an English model originally trained by hydrochii. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classify_model_pipeline_en_5.5.0_3.0_1727313314914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classify_model_pipeline_en_5.5.0_3.0_1727313314914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("text_classify_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("text_classify_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classify_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/hydrochii/text_classify_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-thext_pce_bio_en.md b/docs/_posts/ahmedlone127/2024-09-26-thext_pce_bio_en.md new file mode 100644 index 00000000000000..952f0567ee23b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-thext_pce_bio_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English thext_pce_bio BertForSequenceClassification from pietrocagnasso +author: John Snow Labs +name: thext_pce_bio +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thext_pce_bio` is an English model originally trained by pietrocagnasso. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thext_pce_bio_en_5.5.0_3.0_1727347193219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thext_pce_bio_en_5.5.0_3.0_1727347193219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("thext_pce_bio","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("thext_pce_bio", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thext_pce_bio| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.1 MB| + +## References + +https://huggingface.co/pietrocagnasso/thext-pce-bio \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-thext_pce_bio_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-thext_pce_bio_pipeline_en.md new file mode 100644 index 00000000000000..a05a29726ea62e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-thext_pce_bio_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English thext_pce_bio_pipeline pipeline BertForSequenceClassification from pietrocagnasso +author: John Snow Labs +name: thext_pce_bio_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thext_pce_bio_pipeline` is an English model originally trained by pietrocagnasso. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thext_pce_bio_pipeline_en_5.5.0_3.0_1727347216255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thext_pce_bio_pipeline_en_5.5.0_3.0_1727347216255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("thext_pce_bio_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("thext_pce_bio_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thext_pce_bio_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|412.1 MB| + +## References + +https://huggingface.co/pietrocagnasso/thext-pce-bio + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tibetan_bert_tncc_title_tsheg_en.md b/docs/_posts/ahmedlone127/2024-09-26-tibetan_bert_tncc_title_tsheg_en.md new file mode 100644 index 00000000000000..727f84116fc6f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tibetan_bert_tncc_title_tsheg_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tibetan_bert_tncc_title_tsheg BertForSequenceClassification from UTibetNLP +author: John Snow Labs +name: tibetan_bert_tncc_title_tsheg +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tibetan_bert_tncc_title_tsheg` is an English model originally trained by UTibetNLP. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tibetan_bert_tncc_title_tsheg_en_5.5.0_3.0_1727352876367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tibetan_bert_tncc_title_tsheg_en_5.5.0_3.0_1727352876367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tibetan_bert_tncc_title_tsheg","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tibetan_bert_tncc_title_tsheg", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tibetan_bert_tncc_title_tsheg| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.8 MB| + +## References + +https://huggingface.co/UTibetNLP/tibetan-bert_tncc-title_tsheg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_30_intents_en.md b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_30_intents_en.md new file mode 100644 index 00000000000000..fafaf473e598e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_30_intents_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tiny_bert_30_intents BertForSequenceClassification from m-aliabbas1 +author: John Snow Labs +name: tiny_bert_30_intents +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_30_intents` is an English model originally trained by m-aliabbas1. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_30_intents_en_5.5.0_3.0_1727357217678.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_30_intents_en_5.5.0_3.0_1727357217678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_30_intents","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_30_intents", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_30_intents| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/m-aliabbas1/tiny_bert_30_intents \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_30_intents_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_30_intents_pipeline_en.md new file mode 100644 index 00000000000000..5b646c4577fe68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_30_intents_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tiny_bert_30_intents_pipeline pipeline BertForSequenceClassification from m-aliabbas1 +author: John Snow Labs +name: tiny_bert_30_intents_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_30_intents_pipeline` is an English model originally trained by m-aliabbas1. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_30_intents_pipeline_en_5.5.0_3.0_1727357218835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_30_intents_pipeline_en_5.5.0_3.0_1727357218835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tiny_bert_30_intents_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tiny_bert_30_intents_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_30_intents_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/m-aliabbas1/tiny_bert_30_intents + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_cupstone_en.md b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_cupstone_en.md new file mode 100644 index 00000000000000..584df5e900ea42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_cupstone_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tiny_bert_cupstone BertForSequenceClassification from petermutwiri +author: John Snow Labs +name: tiny_bert_cupstone +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_cupstone` is an English model originally trained by petermutwiri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_cupstone_en_5.5.0_3.0_1727342955802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_cupstone_en_5.5.0_3.0_1727342955802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_cupstone","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_cupstone", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_cupstone| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/petermutwiri/Tiny_Bert_Cupstone \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_cupstone_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_cupstone_pipeline_en.md new file mode 100644 index 00000000000000..e2a1ae6a50b76b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tiny_bert_cupstone_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tiny_bert_cupstone_pipeline pipeline BertForSequenceClassification from petermutwiri +author: John Snow Labs +name: tiny_bert_cupstone_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_cupstone_pipeline` is an English model originally trained by petermutwiri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_cupstone_pipeline_en_5.5.0_3.0_1727342958915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_cupstone_pipeline_en_5.5.0_3.0_1727342958915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tiny_bert_cupstone_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tiny_bert_cupstone_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_cupstone_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/petermutwiri/Tiny_Bert_Cupstone + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tiny_random_debertaforsequenceclassification_ydshieh_en.md b/docs/_posts/ahmedlone127/2024-09-26-tiny_random_debertaforsequenceclassification_ydshieh_en.md new file mode 100644 index 00000000000000..5bc762b12ff091 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tiny_random_debertaforsequenceclassification_ydshieh_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tiny_random_debertaforsequenceclassification_ydshieh BertForSequenceClassification from ydshieh +author: John Snow Labs +name: tiny_random_debertaforsequenceclassification_ydshieh +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_debertaforsequenceclassification_ydshieh` is an English model originally trained by ydshieh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_debertaforsequenceclassification_ydshieh_en_5.5.0_3.0_1727337442071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_debertaforsequenceclassification_ydshieh_en_5.5.0_3.0_1727337442071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_random_debertaforsequenceclassification_ydshieh","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_random_debertaforsequenceclassification_ydshieh", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_debertaforsequenceclassification_ydshieh| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|350.2 KB| + +## References + +https://huggingface.co/ydshieh/tiny-random-DebertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tiny_random_debertaforsequenceclassification_ydshieh_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-tiny_random_debertaforsequenceclassification_ydshieh_pipeline_en.md new file mode 100644 index 00000000000000..d211cf19c5d7b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tiny_random_debertaforsequenceclassification_ydshieh_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tiny_random_debertaforsequenceclassification_ydshieh_pipeline pipeline BertForSequenceClassification from ydshieh +author: John Snow Labs +name: tiny_random_debertaforsequenceclassification_ydshieh_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_debertaforsequenceclassification_ydshieh_pipeline` is an English model originally trained by ydshieh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_debertaforsequenceclassification_ydshieh_pipeline_en_5.5.0_3.0_1727337442451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_debertaforsequenceclassification_ydshieh_pipeline_en_5.5.0_3.0_1727337442451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tiny_random_debertaforsequenceclassification_ydshieh_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tiny_random_debertaforsequenceclassification_ydshieh_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_debertaforsequenceclassification_ydshieh_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|373.0 KB| + +## References + +https://huggingface.co/ydshieh/tiny-random-DebertaForSequenceClassification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-topic_abstract_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-topic_abstract_classification_pipeline_en.md new file mode 100644 index 00000000000000..55682d02492287 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-topic_abstract_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English topic_abstract_classification_pipeline pipeline BertForSequenceClassification from Eitanli +author: John Snow Labs +name: topic_abstract_classification_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_abstract_classification_pipeline` is an English model originally trained by Eitanli. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_abstract_classification_pipeline_en_5.5.0_3.0_1727312820125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_abstract_classification_pipeline_en_5.5.0_3.0_1727312820125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("topic_abstract_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("topic_abstract_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_abstract_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Eitanli/topic_abstract_classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-toxic_comment_classification_pipeline_pt.md b/docs/_posts/ahmedlone127/2024-09-26-toxic_comment_classification_pipeline_pt.md new file mode 100644 index 00000000000000..1c7ca5e2a65ec0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-toxic_comment_classification_pipeline_pt.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Portuguese toxic_comment_classification_pipeline pipeline BertForSequenceClassification from dougtrajano +author: John Snow Labs +name: toxic_comment_classification_pipeline +date: 2024-09-26 +tags: [pt, open_source, pipeline, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_comment_classification_pipeline` is a Portuguese model originally trained by dougtrajano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_comment_classification_pipeline_pt_5.5.0_3.0_1727311086841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_comment_classification_pipeline_pt_5.5.0_3.0_1727311086841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("toxic_comment_classification_pipeline", lang = "pt") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("toxic_comment_classification_pipeline", lang = "pt") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_comment_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|pt| +|Size:|1.3 GB| + +## References + +https://huggingface.co/dougtrajano/toxic-comment-classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-toxic_comment_classification_pt.md b/docs/_posts/ahmedlone127/2024-09-26-toxic_comment_classification_pt.md new file mode 100644 index 00000000000000..c9105a03231268 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-toxic_comment_classification_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese toxic_comment_classification BertForSequenceClassification from dougtrajano +author: John Snow Labs +name: toxic_comment_classification +date: 2024-09-26 +tags: [pt, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_comment_classification` is a Portuguese model originally trained by dougtrajano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_comment_classification_pt_5.5.0_3.0_1727311017512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_comment_classification_pt_5.5.0_3.0_1727311017512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("toxic_comment_classification","pt") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toxic_comment_classification", "pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_comment_classification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|1.3 GB| + +## References + +https://huggingface.co/dougtrajano/toxic-comment-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-toxic_dbmdz_bert_base_turkish_128k_uncased_en.md b/docs/_posts/ahmedlone127/2024-09-26-toxic_dbmdz_bert_base_turkish_128k_uncased_en.md new file mode 100644 index 00000000000000..e31a6ac23abad3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-toxic_dbmdz_bert_base_turkish_128k_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English toxic_dbmdz_bert_base_turkish_128k_uncased BertForSequenceClassification from l2reg +author: John Snow Labs +name: toxic_dbmdz_bert_base_turkish_128k_uncased +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_dbmdz_bert_base_turkish_128k_uncased` is an English model originally trained by l2reg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_en_5.5.0_3.0_1727311027072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_en_5.5.0_3.0_1727311027072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("toxic_dbmdz_bert_base_turkish_128k_uncased","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toxic_dbmdz_bert_base_turkish_128k_uncased", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_dbmdz_bert_base_turkish_128k_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|691.6 MB| + +## References + +https://huggingface.co/l2reg/toxic-dbmdz-bert-base-turkish-128k-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline_en.md new file mode 100644 index 00000000000000..0249c84bbf0220 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline pipeline BertForSequenceClassification from l2reg +author: John Snow Labs +name: toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline` is an English model originally trained by l2reg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline_en_5.5.0_3.0_1727311068189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline_en_5.5.0_3.0_1727311068189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_dbmdz_bert_base_turkish_128k_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|691.6 MB| + +## References + +https://huggingface.co/l2reg/toxic-dbmdz-bert-base-turkish-128k-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-toxicity_target_type_identification_pt.md b/docs/_posts/ahmedlone127/2024-09-26-toxicity_target_type_identification_pt.md new file mode 100644 index 00000000000000..167f49a1c47c40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-toxicity_target_type_identification_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese toxicity_target_type_identification BertForSequenceClassification from dougtrajano +author: John Snow Labs +name: toxicity_target_type_identification +date: 2024-09-26 +tags: [pt, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxicity_target_type_identification` is a Portuguese model originally trained by dougtrajano. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicity_target_type_identification_pt_5.5.0_3.0_1727314228150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicity_target_type_identification_pt_5.5.0_3.0_1727314228150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("toxicity_target_type_identification","pt") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toxicity_target_type_identification", "pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxicity_target_type_identification| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|1.3 GB| + +## References + +https://huggingface.co/dougtrajano/toxicity-target-type-identification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-trac2020_eng_a_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-trac2020_eng_a_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..4f7970b10f1930 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-trac2020_eng_a_bert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English trac2020_eng_a_bert_base_uncased_pipeline pipeline BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_eng_a_bert_base_uncased_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trac2020_eng_a_bert_base_uncased_pipeline` is an English model originally trained by socialmediaie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_eng_a_bert_base_uncased_pipeline_en_5.5.0_3.0_1727311569080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_eng_a_bert_base_uncased_pipeline_en_5.5.0_3.0_1727311569080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trac2020_eng_a_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trac2020_eng_a_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_eng_a_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_ENG_A_bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_b_bert_base_multilingual_uncased_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_b_bert_base_multilingual_uncased_pipeline_xx.md new file mode 100644 index 00000000000000..43023aade40b58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_b_bert_base_multilingual_uncased_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual trac2020_hin_b_bert_base_multilingual_uncased_pipeline pipeline BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_hin_b_bert_base_multilingual_uncased_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trac2020_hin_b_bert_base_multilingual_uncased_pipeline` is a Multilingual model originally trained by socialmediaie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_hin_b_bert_base_multilingual_uncased_pipeline_xx_5.5.0_3.0_1727341890779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_hin_b_bert_base_multilingual_uncased_pipeline_xx_5.5.0_3.0_1727341890779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trac2020_hin_b_bert_base_multilingual_uncased_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trac2020_hin_b_bert_base_multilingual_uncased_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_hin_b_bert_base_multilingual_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_HIN_B_bert-base-multilingual-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_b_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_b_bert_base_multilingual_uncased_xx.md new file mode 100644 index 00000000000000..4f4916a5a1a425 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_b_bert_base_multilingual_uncased_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual trac2020_hin_b_bert_base_multilingual_uncased BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_hin_b_bert_base_multilingual_uncased +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trac2020_hin_b_bert_base_multilingual_uncased` is a Multilingual model originally trained by socialmediaie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_hin_b_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727341856482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_hin_b_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727341856482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_hin_b_bert_base_multilingual_uncased","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_hin_b_bert_base_multilingual_uncased", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_hin_b_bert_base_multilingual_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_HIN_B_bert-base-multilingual-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_c_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_c_bert_base_multilingual_uncased_xx.md new file mode 100644 index 00000000000000..7faa238215481c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-trac2020_hin_c_bert_base_multilingual_uncased_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual trac2020_hin_c_bert_base_multilingual_uncased BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_hin_c_bert_base_multilingual_uncased +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trac2020_hin_c_bert_base_multilingual_uncased` is a Multilingual model originally trained by socialmediaie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_hin_c_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727311680979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_hin_c_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727311680979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_hin_c_bert_base_multilingual_uncased","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_hin_c_bert_base_multilingual_uncased", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_hin_c_bert_base_multilingual_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_HIN_C_bert-base-multilingual-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-trac2020_iben_b_bert_base_multilingual_uncased_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-09-26-trac2020_iben_b_bert_base_multilingual_uncased_pipeline_xx.md new file mode 100644 index 00000000000000..0ef06dbba60ba4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-trac2020_iben_b_bert_base_multilingual_uncased_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual trac2020_iben_b_bert_base_multilingual_uncased_pipeline pipeline BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_iben_b_bert_base_multilingual_uncased_pipeline +date: 2024-09-26 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trac2020_iben_b_bert_base_multilingual_uncased_pipeline` is a Multilingual model originally trained by socialmediaie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_iben_b_bert_base_multilingual_uncased_pipeline_xx_5.5.0_3.0_1727345471552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_iben_b_bert_base_multilingual_uncased_pipeline_xx_5.5.0_3.0_1727345471552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trac2020_iben_b_bert_base_multilingual_uncased_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trac2020_iben_b_bert_base_multilingual_uncased_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_iben_b_bert_base_multilingual_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_IBEN_B_bert-base-multilingual-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-trac2020_iben_b_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2024-09-26-trac2020_iben_b_bert_base_multilingual_uncased_xx.md new file mode 100644 index 00000000000000..31b496e8f41bbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-trac2020_iben_b_bert_base_multilingual_uncased_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual trac2020_iben_b_bert_base_multilingual_uncased BertForSequenceClassification from socialmediaie +author: John Snow Labs +name: trac2020_iben_b_bert_base_multilingual_uncased +date: 2024-09-26 +tags: [xx, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: xx +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trac2020_iben_b_bert_base_multilingual_uncased` is a Multilingual model originally trained by socialmediaie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_iben_b_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727345438747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_iben_b_bert_base_multilingual_uncased_xx_5.5.0_3.0_1727345438747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_iben_b_bert_base_multilingual_uncased","xx") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_iben_b_bert_base_multilingual_uncased", "xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trac2020_iben_b_bert_base_multilingual_uncased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/socialmediaie/TRAC2020_IBEN_B_bert-base-multilingual-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-transaction_categorization_en.md b/docs/_posts/ahmedlone127/2024-09-26-transaction_categorization_en.md new file mode 100644 index 00000000000000..89ff8284c1731b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-transaction_categorization_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English transaction_categorization BertForSequenceClassification from jonjimenez +author: John Snow Labs +name: transaction_categorization +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `transaction_categorization` is an English model originally trained by jonjimenez. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/transaction_categorization_en_5.5.0_3.0_1727352691983.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/transaction_categorization_en_5.5.0_3.0_1727352691983.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("transaction_categorization","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("transaction_categorization", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|transaction_categorization| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonjimenez/transaction-categorization \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-transaction_categorization_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-transaction_categorization_pipeline_en.md new file mode 100644 index 00000000000000..6deb2b8b3153ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-transaction_categorization_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English transaction_categorization_pipeline pipeline BertForSequenceClassification from jonjimenez +author: John Snow Labs +name: transaction_categorization_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `transaction_categorization_pipeline` is an English model originally trained by jonjimenez. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/transaction_categorization_pipeline_en_5.5.0_3.0_1727352712926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/transaction_categorization_pipeline_en_5.5.0_3.0_1727352712926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("transaction_categorization_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("transaction_categorization_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|transaction_categorization_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonjimenez/transaction-categorization + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tupi_bert_large_portuguese_cased_pipeline_pt.md b/docs/_posts/ahmedlone127/2024-09-26-tupi_bert_large_portuguese_cased_pipeline_pt.md new file mode 100644 index 00000000000000..53cf32175d65c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tupi_bert_large_portuguese_cased_pipeline_pt.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Portuguese tupi_bert_large_portuguese_cased_pipeline pipeline BertForSequenceClassification from FpOliveira +author: John Snow Labs +name: tupi_bert_large_portuguese_cased_pipeline +date: 2024-09-26 +tags: [pt, open_source, pipeline, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tupi_bert_large_portuguese_cased_pipeline` is a Portuguese model originally trained by FpOliveira.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tupi_bert_large_portuguese_cased_pipeline_pt_5.5.0_3.0_1727344691119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tupi_bert_large_portuguese_cased_pipeline_pt_5.5.0_3.0_1727344691119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tupi_bert_large_portuguese_cased_pipeline", lang = "pt") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tupi_bert_large_portuguese_cased_pipeline", lang = "pt") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tupi_bert_large_portuguese_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|pt| +|Size:|1.3 GB| + +## References + +https://huggingface.co/FpOliveira/tupi-bert-large-portuguese-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-tupi_bert_large_portuguese_cased_pt.md b/docs/_posts/ahmedlone127/2024-09-26-tupi_bert_large_portuguese_cased_pt.md new file mode 100644 index 00000000000000..2667a33d36a924 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-tupi_bert_large_portuguese_cased_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese tupi_bert_large_portuguese_cased BertForSequenceClassification from FpOliveira +author: John Snow Labs +name: tupi_bert_large_portuguese_cased +date: 2024-09-26 +tags: [pt, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: pt +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tupi_bert_large_portuguese_cased` is a Portuguese model originally trained by FpOliveira.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tupi_bert_large_portuguese_cased_pt_5.5.0_3.0_1727344627895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tupi_bert_large_portuguese_cased_pt_5.5.0_3.0_1727344627895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("tupi_bert_large_portuguese_cased","pt") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tupi_bert_large_portuguese_cased", "pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tupi_bert_large_portuguese_cased| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|1.3 GB| + +## References + +https://huggingface.co/FpOliveira/tupi-bert-large-portuguese-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-uned_tfg_08_58_mas_frecuentes_en.md b/docs/_posts/ahmedlone127/2024-09-26-uned_tfg_08_58_mas_frecuentes_en.md new file mode 100644 index 00000000000000..892d4a6fc77815 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-uned_tfg_08_58_mas_frecuentes_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English uned_tfg_08_58_mas_frecuentes BertForSequenceClassification from alexisdr +author: John Snow Labs +name: uned_tfg_08_58_mas_frecuentes +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uned_tfg_08_58_mas_frecuentes` is an English model originally trained by alexisdr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/uned_tfg_08_58_mas_frecuentes_en_5.5.0_3.0_1727321642070.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/uned_tfg_08_58_mas_frecuentes_en_5.5.0_3.0_1727321642070.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("uned_tfg_08_58_mas_frecuentes","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("uned_tfg_08_58_mas_frecuentes", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|uned_tfg_08_58_mas_frecuentes| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/alexisdr/uned-tfg-08.58_mas_frecuentes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-uned_tfg_08_58_mas_frecuentes_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-uned_tfg_08_58_mas_frecuentes_pipeline_en.md new file mode 100644 index 00000000000000..ef30571fd38296 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-uned_tfg_08_58_mas_frecuentes_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English uned_tfg_08_58_mas_frecuentes_pipeline pipeline BertForSequenceClassification from alexisdr +author: John Snow Labs +name: uned_tfg_08_58_mas_frecuentes_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uned_tfg_08_58_mas_frecuentes_pipeline` is an English model originally trained by alexisdr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/uned_tfg_08_58_mas_frecuentes_pipeline_en_5.5.0_3.0_1727321676514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/uned_tfg_08_58_mas_frecuentes_pipeline_en_5.5.0_3.0_1727321676514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("uned_tfg_08_58_mas_frecuentes_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("uned_tfg_08_58_mas_frecuentes_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|uned_tfg_08_58_mas_frecuentes_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/alexisdr/uned-tfg-08.58_mas_frecuentes + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-vacc_en.md b/docs/_posts/ahmedlone127/2024-09-26-vacc_en.md new file mode 100644 index 00000000000000..27f0d3af1743a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-vacc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English vacc BertForSequenceClassification from abigailp +author: John Snow Labs +name: vacc +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vacc` is an English model originally trained by abigailp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vacc_en_5.5.0_3.0_1727311237518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vacc_en_5.5.0_3.0_1727311237518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("vacc","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("vacc", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vacc| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/abigailp/vacc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-vacc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-vacc_pipeline_en.md new file mode 100644 index 00000000000000..d02f0e13789ed7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-vacc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English vacc_pipeline pipeline BertForSequenceClassification from abigailp +author: John Snow Labs +name: vacc_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vacc_pipeline` is an English model originally trained by abigailp. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vacc_pipeline_en_5.5.0_3.0_1727311258350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vacc_pipeline_en_5.5.0_3.0_1727311258350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("vacc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("vacc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vacc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/abigailp/vacc + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-vetbertdx_en.md b/docs/_posts/ahmedlone127/2024-09-26-vetbertdx_en.md new file mode 100644 index 00000000000000..ed2fc37aaca263 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-vetbertdx_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English vetbertdx BertForSequenceClassification from havocy28 +author: John Snow Labs +name: vetbertdx +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vetbertdx` is an English model originally trained by havocy28. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vetbertdx_en_5.5.0_3.0_1727334997813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vetbertdx_en_5.5.0_3.0_1727334997813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("vetbertdx","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("vetbertdx", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vetbertdx| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.2 MB| + +## References + +https://huggingface.co/havocy28/VetBERTDx \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-yang_grammer_check_pipeline_en.md b/docs/_posts/ahmedlone127/2024-09-26-yang_grammer_check_pipeline_en.md new file mode 100644 index 00000000000000..d201a308e4e4d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-yang_grammer_check_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English yang_grammer_check_pipeline pipeline BertForSequenceClassification from xy4286 +author: John Snow Labs +name: yang_grammer_check_pipeline +date: 2024-09-26 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yang_grammer_check_pipeline` is an English model originally trained by xy4286. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yang_grammer_check_pipeline_en_5.5.0_3.0_1727344298679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yang_grammer_check_pipeline_en_5.5.0_3.0_1727344298679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("yang_grammer_check_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("yang_grammer_check_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yang_grammer_check_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/xy4286/yang-grammer-check + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-09-26-yelp_model_3k_10layer_en.md b/docs/_posts/ahmedlone127/2024-09-26-yelp_model_3k_10layer_en.md new file mode 100644 index 00000000000000..46d8cf91144161 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-09-26-yelp_model_3k_10layer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English yelp_model_3k_10layer BertForSequenceClassification from mogmyij +author: John Snow Labs +name: yelp_model_3k_10layer +date: 2024-09-26 +tags: [en, open_source, onnx, sequence_classification, bert] +task: Text Classification +language: en +edition: Spark NLP 5.5.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yelp_model_3k_10layer` is an English model originally trained by mogmyij. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yelp_model_3k_10layer_en_5.5.0_3.0_1727353411460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yelp_model_3k_10layer_en_5.5.0_3.0_1727353411460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification.pretrained("yelp_model_3k_10layer","en") \ + .setInputCols(["document","token"]) \ + .setOutputCol("class") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("yelp_model_3k_10layer", "en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yelp_model_3k_10layer| +|Compatibility:|Spark NLP 5.5.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.2 MB| + +## References + +https://huggingface.co/mogmyij/yelp-model-3k-10layer \ No newline at end of file