diff --git a/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr.md b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr.md index 6d378879613f22..a9f97daf26b748 100644 --- a/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr.md +++ b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr.md @@ -4,7 +4,7 @@ title: DistilBERTZero-Shot Classification Base - distilbert_base_zero_shot_class author: John Snow Labs name: distilbert_base_zero_shot_classifier_turkish_cased_allnli date: 2023-04-20 -tags: [zero_shot, distilbert, base, tr, turkish, cased, open_source, tensorflow] +tags: [distilbert, zero_shot, turkish, tr, base, open_source, tensorflow] task: Zero-Shot Classification language: tr edition: Spark NLP 4.4.1 @@ -32,8 +32,8 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti {:.btn-box} -[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_4.4.1_3.2_1681950583033.zip){:.button.button-orange} -[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1681950583033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1682016415236.zip){:.button.button-orange} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1682016415236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} ## How to use @@ -63,7 +63,6 @@ document_assembler, tokenizer, zeroShotClassifier ]) - example = spark.createDataFrame([['Senaryo çok saçmaydı, beğendim 
diyemem.']]).toDF("text") result = pipeline.fit(example).transform(example) ``` @@ -84,9 +83,7 @@ val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilb .setCandidateLabels(Array("olumsuz", "olumlu")) val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier)) - val example = Seq("Senaryo çok saçmaydı, beğendim diyemem.").toDS.toDF("text") - val result = pipeline.fit(example).transform(example) ``` @@ -104,4 +101,4 @@ val result = pipeline.fit(example).transform(example) |Output Labels:|[multi_class]| |Language:|tr| |Size:|254.3 MB| -|Case sensitive:|true| +|Case sensitive:|true| \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md index eb05ea476bc5a4..2395f728406d60 100644 --- a/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md +++ b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md @@ -32,8 +32,8 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti {:.btn-box} -[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1681952299918.zip){:.button.button-orange} -[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1681952299918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1682014879417.zip){:.button.button-orange} +[Copy S3 
URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1682014879417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} ## How to use @@ -45,7 +45,6 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti document_assembler = DocumentAssembler() \ .setInputCol('text') \ .setOutputCol('document') - tokenizer = Tokenizer() \ .setInputCols(['document']) \ .setOutputCol('token') @@ -63,10 +62,8 @@ document_assembler, tokenizer, zeroShotClassifier ]) - example = spark.createDataFrame([['Dolar yükselmeye devam ediyor.']]).toDF("text") result = pipeline.fit(example).transform(example) - ``` ```scala val document_assembler = DocumentAssembler() @@ -85,9 +82,7 @@ val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilb .setCandidateLabels(Array("ekonomi", "siyaset","spor")) val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier)) - val example = Seq("Dolar yükselmeye devam ediyor.").toDS.toDF("text") - val result = pipeline.fit(example).transform(example) ``` diff --git a/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md index 63840286509e53..4e98ec4735f69a 100644 --- a/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md +++ b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md @@ -32,8 +32,8 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti {:.btn-box} -[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1681951486863.zip){:.button.button-orange} -[Copy S3 
URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1681951486863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1682015986268.zip){:.button.button-orange} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1682015986268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} ## How to use @@ -63,7 +63,6 @@ document_assembler, tokenizer, zeroShotClassifier ]) - example = spark.createDataFrame([['Senaryo çok saçmaydı, beğendim diyemem.']]).toDF("text") result = pipeline.fit(example).transform(example) ``` @@ -75,8 +74,9 @@ val document_assembler = DocumentAssembler() val tokenizer = Tokenizer() .setInputCols("document") .setOutputCol("token") +val zeroShotClassifier = -val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_turkish_cased_snli", "en") +DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_turkish_cased_snli", "en") .setInputCols("document", "token") .setOutputCol("class") .setCaseSensitive(true) @@ -84,9 +84,7 @@ val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilb .setCandidateLabels(Array("olumsuz", "olumlu")) val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier)) - val example = Seq("Senaryo çok saçmaydı, beğendim diyemem.").toDS.toDF("text") - val result = pipeline.fit(example).transform(example) ``` diff --git a/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_uncased_mnli_en.md b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_uncased_mnli_en.md new file mode 100644 index 00000000000000..5caaaf1e7bdd2c --- 
/dev/null
+++ b/docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_uncased_mnli_en.md
@@ -0,0 +1,105 @@
+---
+layout: model
+title: DistilBERTZero-Shot Classification Base - MNLI(distilbert_base_zero_shot_classifier_uncased_mnli)
+author: John Snow Labs
+name: distilbert_base_zero_shot_classifier_uncased_mnli
+date: 2023-04-20
+tags: [zero_shot, en, mnli, distilbert, english, base, open_source, tensorflow]
+task: Zero-Shot Classification
+language: en
+edition: Spark NLP 4.4.1
+spark_version: [3.2, 3.0]
+supported: true
+engine: tensorflow
+annotator: DistilBertForZeroShotClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+This model is intended to be used for zero-shot text classification, especially in English. It is fine-tuned on MNLI by using DistilBERT Base Uncased model.
+
+DistilBertForZeroShotClassification using a ModelForSequenceClassification trained on NLI (natural language inference) tasks. Equivalent of DistilBertForSequenceClassification models, but these models don’t require a hardcoded number of potential classes, they can be chosen at runtime. It usually means it’s slower but it is much more flexible.
+
+We used TFDistilBertForSequenceClassification to train this model and used DistilBertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale!
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_en_4.4.1_3.2_1682015669457.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_en_4.4.1_3.2_1682015669457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+