diff --git a/docs/_posts/ahmedlone127/2024-02-16-distil_asr_whisper_small_en.md b/docs/_posts/ahmedlone127/2024-02-16-distil_asr_whisper_small_en.md
new file mode 100644
index 00000000000000..feb8c9a68a4a2d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-02-16-distil_asr_whisper_small_en.md
@@ -0,0 +1,92 @@
+---
+layout: model
+title: English distil_asr_whisper_small WhisperForCTC from distil-whisper
+author: John Snow Labs
+name: distil_asr_whisper_small
+date: 2024-02-16
+tags: [en, open_source, onnx]
+task: Automatic Speech Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: WhisperForCTC
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. distil_asr_whisper_small is an English model originally trained by distil-whisper.
+
+This model is only compatible with PySpark 3.4 and above.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_small_en_5.2.4_3.0_1708118638184.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_small_en_5.2.4_3.0_1708118638184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+audioAssembler = AudioAssembler() \
+    .setInputCol("audio_content") \
+    .setOutputCol("audio_assembler")
+
+
+speechToText = WhisperForCTC.pretrained("distil_asr_whisper_small","en") \
+    .setInputCols(["audio_assembler"]) \
+    .setOutputCol("text")
+
+pipeline = Pipeline().setStages([audioAssembler, speechToText])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val audioAssembler = new AudioAssembler()
+  .setInputCol("audio_content")
+  .setOutputCol("audio_assembler")
+
+val speechToText = WhisperForCTC.pretrained("distil_asr_whisper_small","en")
+  .setInputCols(Array("audio_assembler"))
+  .setOutputCol("text")
+
+val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distil_asr_whisper_small|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[audio_assembler]|
+|Output Labels:|[text]|
+|Language:|en|
+|Size:|748.5 MB|
+
+## References
+
+https://huggingface.co/distil-whisper/distil-small.en
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-02-25-distil_asr_whisper_medium_en.md b/docs/_posts/ahmedlone127/2024-02-25-distil_asr_whisper_medium_en.md
new file mode 100644
index 00000000000000..ed115611e6e969
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-02-25-distil_asr_whisper_medium_en.md
@@ -0,0 +1,89 @@
+---
+layout: model
+title: English distil_asr_whisper_medium WhisperForCTC from distil-whisper
+author: John Snow Labs
+name: distil_asr_whisper_medium
+date: 2024-02-25
+tags: [whisper, en, open_source, onnx]
+task: Automatic Speech Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: WhisperForCTC
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. distil_asr_whisper_medium is an English model originally trained by distil-whisper.
+
+This model is only compatible with PySpark 3.4 and above.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_medium_en_5.2.4_3.4_1708901703317.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_medium_en_5.2.4_3.4_1708901703317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+audioAssembler = AudioAssembler() \
+    .setInputCol("audio_content") \
+    .setOutputCol("audio_assembler")
+
+
+speechToText = WhisperForCTC.pretrained("distil_asr_whisper_medium","en") \
+    .setInputCols(["audio_assembler"]) \
+    .setOutputCol("text")
+
+pipeline = Pipeline().setStages([audioAssembler, speechToText])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val audioAssembler = new AudioAssembler()
+  .setInputCol("audio_content")
+  .setOutputCol("audio_assembler")
+
+val speechToText = WhisperForCTC.pretrained("distil_asr_whisper_medium","en")
+  .setInputCols(Array("audio_assembler"))
+  .setOutputCol("text")
+val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distil_asr_whisper_medium|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[audio_assembler]|
+|Output Labels:|[text]|
+|Language:|en|
+|Size:|1.4 GB|
+
+## References
+
+https://huggingface.co/distil-whisper/distil-medium.en
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2024-02-26-distil_asr_whisper_large_v2_en.md b/docs/_posts/ahmedlone127/2024-02-26-distil_asr_whisper_large_v2_en.md
new file mode 100644
index 00000000000000..4f0d0343d308d1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-02-26-distil_asr_whisper_large_v2_en.md
@@ -0,0 +1,88 @@
+---
+layout: model
+title: English distil_asr_whisper_large_v2 WhisperForCTC from distil-whisper
+author: John Snow Labs
+name: distil_asr_whisper_large_v2
+date: 2024-02-26
+tags: [en, open_source, onnx]
+task: Automatic Speech Recognition
+language: en
+edition: Spark NLP 5.2.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: WhisperForCTC
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. distil_asr_whisper_large_v2 is an English model originally trained by distil-whisper.
+
+This model is only compatible with PySpark 3.4 and above.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_large_v2_en_5.2.4_3.4_1708969018025.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_asr_whisper_large_v2_en_5.2.4_3.4_1708969018025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+audioAssembler = AudioAssembler() \
+    .setInputCol("audio_content") \
+    .setOutputCol("audio_assembler")
+
+
+speechToText = WhisperForCTC.pretrained("distil_asr_whisper_large_v2","en") \
+    .setInputCols(["audio_assembler"]) \
+    .setOutputCol("text")
+
+pipeline = Pipeline().setStages([audioAssembler, speechToText])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val audioAssembler = new AudioAssembler()
+  .setInputCol("audio_content")
+  .setOutputCol("audio_assembler")
+
+val speechToText = WhisperForCTC.pretrained("distil_asr_whisper_large_v2","en")
+  .setInputCols(Array("audio_assembler"))
+  .setOutputCol("text")
+val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distil_asr_whisper_large_v2|
+|Compatibility:|Spark NLP 5.2.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[audio_assembler]|
+|Output Labels:|[text]|
+|Language:|en|
+|Size:|2.4 GB|
+
+## References
+
+https://huggingface.co/distil-whisper/distil-large-v2
\ No newline at end of file
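
Note on the `How to use` snippets in the three model cards above: they call `pipeline.fit(data)` without showing how `data` is built. The following is a minimal sketch, not taken from the cards, of one way to prepare the input. It assumes the `audio_content` column holds raw audio as an array of floats and that the source file is 16-bit mono PCM WAV (Whisper models generally expect 16 kHz audio). The helper name `wav_to_floats` and the file name `sample.wav` are hypothetical and not part of Spark NLP.

```python
import struct
import wave


def wav_to_floats(path):
    """Read a 16-bit mono PCM WAV file.

    Returns (samples, sample_rate), where samples is a list of floats
    scaled into the range [-1.0, 1.0).
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    # WAV stores samples as little-endian signed 16-bit integers.
    count = len(frames) // 2
    ints = struct.unpack("<%dh" % count, frames)
    return [value / 32768.0 for value in ints], rate


# Hypothetical usage, assuming an existing SparkSession named `spark`:
# samples, rate = wav_to_floats("sample.wav")
# data = spark.createDataFrame([(samples,)], ["audio_content"])
```

Audio recorded at a different sample rate or with more than one channel would need resampling or downmixing first, which this sketch does not attempt.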