2024-09-09-distilbert_extractive_qa_large_project_pipeline_en (#14397)
* Add model 2024-09-09-dock_1_en

* Add model 2024-09-09-floor_model_en

* Add model 2024-09-08-burmese_awesome_qa_model_zhandsome_pipeline_en

* Add model 2024-09-08-distilbert_base_uncased_finetuned_imdb_accelerate_minshengchan_pipeline_en

* Add model 2024-09-08-multi_qa_mpnet_base_dot_v1_deneme_pipeline_en

* Add model 2024-09-07-model_perturbations_en

* Add model 2024-09-10-distilbertfinetunehsfifteenepoch_en

* Add model 2024-09-07-codebert_small_v2_en

* Add model 2024-09-08-qa_pipeline_en

* Add model 2024-09-09-burmese_awesome_qa_model_acezxn_pipeline_en

* Add model 2024-09-09-all_roberta_large_v1_work_4_16_5_pipeline_en

* Add model 2024-09-08-translation_not_evaluated_en

* Add model 2024-09-09-xlm_roberta_base_finetuned_panx_german_qihehehehe_pipeline_en

* Add model 2024-09-07-dummy_model_sidd_2203_pipeline_en

* Add model 2024-09-07-distilbert_base_uncased_finetuned_ner_seanlee7_pipeline_en

* Add model 2024-09-09-marian_finetuned_kde4_english_tonga_tonga_islands_french_fah_d_pipeline_en

* Add model 2024-09-09-dummy_model_jdang_pipeline_en

* Add model 2024-09-08-marianmt_many2eng_leb_pipeline_en

* Add model 2024-09-09-opus_maltese_walloon_english_finetuned_npomo_english_5_epochs_pipeline_en

* Add model 2024-09-09-testest_ar

* Add model 2024-09-09-roberta_legal_german_cased_german_legal_squad_part_augmented_1000_de

* Add model 2024-09-07-testing_en

* Add model 2024-09-09-malicious_url_detection_en

* Add model 2024-09-10-masked_language_model_nikhilwani_en

* Add model 2024-09-10-distilbert_base_uncased_finetuned_imdb_vasaicrow_pipeline_en

* Add model 2024-09-10-lettuce_sayula_popoluca_dutch_xlm_pipeline_en

* Add model 2024-09-07-sent_albert_persian_farsi_zwnj_base_v2_pipeline_fa

* Add model 2024-09-09-translation_english_lug_v4_pipeline_en

* Add model 2024-09-10-whisper_small_arabic_raghadalghonaim_ar

* Add model 2024-09-05-roberta_large_bne_telugu_pipeline_es

* Add model 2024-09-09-otis_official_spam_model_en

* Add model 2024-09-10-dummy_model_sayakadak24_pipeline_en

* Add model 2024-09-10-training_v2_pipeline_ru

* Add model 2024-09-07-action_policy_plans_classifier_pipeline_en

* Add model 2024-09-10-fine_tune_whisper_kagglex_pipeline_hi

* Add model 2024-09-08-early_readmission_deberta_pipeline_en

* Add model 2024-09-07-bert_base_cased_ner_conll2003_en

* Add model 2024-09-09-whispasr_pipeline_hi

* Add model 2024-09-07-burmese_awesome_qa_model_ravinderbrai_pipeline_en

* Add model 2024-09-09-dasvny_pipeline_en

* Add model 2024-09-08-iwslt17_marian_small_ctx4_cwd3_english_french_pipeline_en

* Add model 2024-09-09-cold_fusion_itr22_seed4_en

* Add model 2024-09-09-prev_lab1_finetuning_en

* Add model 2024-09-08-squad_qa_model_portuguese_en

* Add model 2024-09-10-whisper_small_hindi_tortoise17_hi

* Add model 2024-09-07-model_perturbations_pipeline_en

* Add model 2024-09-03-bge_base_securiti_dataset_1_v19_pipeline_en

* Add model 2024-09-08-query_only_5_pipeline_en

* Add model 2024-09-07-burmese_pii_model_pipeline_en

* Add model 2024-09-10-whisper_small_hindi_tortoise17_pipeline_hi

* Add model 2024-09-08-marian_finetuned_kde4_english_tonga_tonga_islands_german_accelerate_translator_nlp_course_chapter7_section3_pipeline_en

* Add model 2024-09-08-clasificadorcorreosoportedistilespanol_en

* Add model 2024-09-05-test_airbus_year_report_en

* Add model 2024-09-09-medical_pubmed_8_17_pipeline_en

* Add model 2024-09-07-ct_kld_xlmr_idkmrc_1_en

* Add model 2024-09-09-routing_module_action_question_conversation_move_hack_debertav3_cls_pipeline_en

* Add model 2024-09-08-dummy_model_itsramyah_en

* Add model 2024-09-09-xlm_roberta_base_finetuned_panx_german_solvaysphere_en

* Add model 2024-09-09-covid_tweet_sentiment_analyzer_roberta_en

* Add model 2024-09-08-distilbert_base_uncased_finetuned_ner_coffee3699_en

* Add model 2024-09-08-distilbert_base_uncased_finetuned_emotion_lilvoda_pipeline_en

* Add model 2024-09-08-deberta_v3_large_survey_cross_passage_consistency_rater_half_gpt4_pipeline_en

* Add model 2024-09-10-bert_base_cased_squad_v1_1_portuguese_v1_1_9_en

* Add model 2024-09-10-histbert_finetuned_ner_en

* Add model 2024-09-10-coha1960s_pipeline_en

* Add model 2024-09-09-mpnet_qa_en

* Add model 2024-09-07-ct_kld_xlmr_idkmrc_1_pipeline_en

* Add model 2024-09-08-inde_4_pipeline_en

* Add model 2024-09-04-burmese_awesome_roberta_model_en

* Add model 2024-09-09-burmese_awesome_qa_model_10_en

* Add model 2024-09-07-setfit_model_misinformation_on_organizations_gofundme_wef_en

* Add model 2024-09-09-q2e_ep3_35_en

* Add model 2024-09-08-deberta_v3_large_survey_main_passage_consistency_rater_all_gpt4_en

* Add model 2024-09-10-distilbert_word2vec_256k_mlm_500k_en

* Add model 2024-09-10-whisper_speech_small_en

* Add model 2024-09-07-sent_financialbert_pipeline_en

* Add model 2024-09-07-taiyi_roberta_124m_d_pipeline_en

* Add model 2024-09-10-whisper_speech_small_pipeline_en

* Add model 2024-09-08-bleurt_base_128_pipeline_en

* Add model 2024-09-10-distilbert_base_uncased_finetuned_squad_jstotz64_en

* Add model 2024-09-10-distilbert_base_uncased_finetuned_imdb_yaojingguo_pipeline_en

* Add model 2024-09-10-distilbert_base_uncased_finetuned_imdb_ssv273_pipeline_en

* Add model 2024-09-10-msmarco_distilbert_word2vec256k_mlm_230k_pipeline_en

* Add model 2024-09-10-distilbert_qa_mysquadv2_8Jan22_finetuned_squad_en

* Add model 2024-09-09-xlm_roberta_base_finetuned_panx_german_italian_pipeline_en

* Add model 2024-09-06-distilbert_base_uncased_finetuned_imdb_kiwihead15_en

* Add model 2024-09-08-distilbert_base_uncased_finetuned_imdb_wwm_pipeline_en

* Add model 2024-09-10-bert_base_uncased_ftd_on_glue_qqp_iter_1_en

* Add model 2024-09-09-roberta_qa_fpdm_soup_model_squad2.0_pipeline_en

* Add model 2024-09-06-burmese_awesome_qa_model_walter133_en

* Add model 2024-09-09-roberta_base_finetuned_hotpot_qa_en

* Add model 2024-09-10-bert_classifier_autonlp_cat333_624217911_zh

* Add model 2024-09-07-cuad_distil_governing_law_08_25_v1_en

* Add model 2024-09-10-bert_base_banking77_pt2_sharmax_vikas_pipeline_en

* Add model 2024-09-10-gdpr_consent_agreement_en

* Add model 2024-09-06-opus_maltese_russian_english_end_tonga_tonga_islands_end_russian_tonga_tonga_islands_english_pipeline_en

* Add model 2024-09-10-e2e_deployment_en

* Add model 2024-09-08-opus_base_lsp_aon_wce_pipeline_en

* Add model 2024-09-10-fintwitbert_sentiment_stephanakkerman_en

* Add model 2024-09-10-lm6_movie_aspect_extraction_bert_en

* Add model 2024-09-10-icd_10_code_prediction_en

* Add model 2024-09-10-bert_base_turkish_sentiment_analysis_tr

* Add model 2024-09-08-from_classifier_v0_en

* Add model 2024-09-08-token_classification_model_vishnun0027_en

* Add model 2024-09-08-lin_camembert_base_en

* Add model 2024-09-04-finer_distillbert_v2_pipeline_en

* Add model 2024-09-04-dummy_model_dvd005_en

* Add model 2024-09-07-results_elseif02_en

* Add model 2024-09-06-burmese_awesome_wnut_model_studentmsd1_en

* Add model 2024-09-08-q2e_333_en

* Add model 2024-09-06-qa_synth_data_with_unanswerable_23_aug_xlm_roberta_base_pipeline_en

* Add model 2024-09-10-fine_tuned_qas_squad_2_with_roberta_large_en

* Add model 2024-09-10-roberta_updated_model_02b_pipeline_en

* Add model 2024-09-10-roberta_finetuned_qa_pipeline_en

* Add model 2024-09-10-climatebert_finetuned_qa_policy_long_en

* Add model 2024-09-07-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_likhith231_en

* Add model 2024-09-10-roberta_base_finetuned_squad_v2_hcy5561_pipeline_en

* Add model 2024-09-10-robertita_cased_finetuned_squad_en

* Add model 2024-09-07-burmese_awesome_wnut_model_carlonos_pipeline_en

* Add model 2024-09-10-roberta_base_finetuned_squad_nlp_course_chapter7_section6_pipeline_en

* Add model 2024-09-08-hindi_roberta_ner_pipeline_en

* Add model 2024-09-10-squad_clip_text_3_pipeline_en

* Add model 2024-09-10-roberta_finetuned_medquad_3_pipeline_en

* Add model 2024-09-10-output_mask_step_pretraining_plus_contr_roberta_large_epochs_1_en

* Add model 2024-09-10-whisper_small_hindi_harshitjoshi_pipeline_hi

* Add model 2024-09-10-output_mask_step_pretraining_plus_contr_roberta_large_epochs_1_pipeline_en

* Add model 2024-09-10-finetuned_model_kunalmod_en

* Add model 2024-09-09-electra_classifier_bertic_tweetsentiment_pipeline_xx

* Add model 2024-09-06-rotten_tomatoes_microsoft_deberta_v3_base_seed_2_pipeline_en

* Add model 2024-09-09-roberta_ner_polygot_MT4TS_pipeline_en

* Add model 2024-09-09-babyberta_wikipedia_french_aochildes_french_without_masking_seed6_finetuned_squad_pipeline_en

* Add model 2024-09-09-distilbert_base_uncased_finetuned_imdb_vonewman_en

* Add model 2024-09-07-julibert_pipeline_ca

* Add model 2024-09-06-xlm_roberta_qa_Part_2_XLM_Model_E1_pipeline_en

* Add model 2024-09-08-lab2_8bit_adam_reshphil_pipeline_en

* Add model 2024-09-09-distilbert_base_uncased_finetuned_pubmed_torch_trained_tabbas97_en

* Add model 2024-09-10-cuad_distil_document_name_cased_08_31_v1_en

* Add model 2024-09-10-squad_qa_model_jamesmcmill_pipeline_en

* Add model 2024-09-10-burmese_awesome_qa_model_sachinsharma0325_en

* Add model 2024-09-07-xlm_roberta_base_finetuned_panx_german_french_transformersbook_en

* Add model 2024-09-10-jai_shri_ram_finetuned_squad_en

* Add model 2024-09-10-bert_base_qa_model_7up_en

* Add model 2024-09-10-burmese_awesome_qa_model_fsghs_pipeline_en

* Add model 2024-09-10-burmese_awesome_qa_model_fsghs_en

* Add model 2024-09-10-burmese_awesome_qa_model_jleung1618_en

* Add model 2024-09-10-distilbert_base_uncased_finetuned_squad_orgilj_en

* Add model 2024-09-10-distilbert_base_uncased_finetuned_squad_superlazycoder_en

* Add model 2024-09-08-autotrain_event_en

* Add model 2024-09-08-all_mpnet_base_newtriplets_v2_lr_1e_8_m_5_e_3_pipeline_en

* Add model 2024-09-10-distilbertfinetunehsfifteenepoch_pipeline_en

* Add model 2024-09-08-distilbert_base_cased_distilbert_pipeline_en

* Add model 2024-09-09-opus_maltese_english_chinese_pipeline_en

* Add model 2024-09-09-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_hfdsajkfd_en

* Add model 2024-09-09-test_model_tianyi_zhang_pipeline_en

* Add model 2024-09-08-deberta_v3_large_survey_topicality_rater_half_gpt4_pipeline_en

* Add model 2024-09-07-nreimers_minilmv2_l6_h384_distilled_from_roberta_large_pipeline_en

* Add model 2024-09-09-xlm_roberta_base_finetuned_panx_german_french_inniok_en

* Add model 2024-09-06-distilbert_base_uncased_finetuned_squad_d5716d28_ahmed97_en

* Add model 2024-09-09-roberta_base_meta_tuning_test_en

* Add model 2024-09-03-opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat_en

* Add model 2024-09-09-all_mpnet_base_v2_kunwooshin_pipeline_en

* Add model 2024-09-08-xlm_roberta_base_finetuned_panx_english_cyycyy_en

* Add model 2024-09-10-msmarco_distilbert_word2vec256k_mlm_230k_en

* Add model 2024-09-09-petbert_pipeline_en

* Add model 2024-09-10-bert_finetuned_squadv2_en

* Add model 2024-09-09-bert_southern_sotho_qa_multi_qa_mpnet_base_cos_v1_epochs_10_en

* Add model 2024-09-09-distilroberta_financial_sentiment_model_5000_samples_fine_tune_en

* Add model 2024-09-09-opus_maltese_english_bkm_10e10encdec_pipeline_en

* Add model 2024-09-08-output_en

* Add model 2024-09-10-rmse_4_en

* Add model 2024-09-10-cros_2_en

* Add model 2024-09-10-roberta_base_imdb_trained_en

* Add model 2024-09-08-xlm_roberta_base_sentiment_multilingual_xx

* Add model 2024-09-10-financebert_en

* Add model 2024-09-10-covid_vaccine_sentiment_analysis_roberta_model_newtonkimathi_en

* Add model 2024-09-10-ddi_pipeline_en

* Add model 2024-09-10-bert_base_kununua_model_pipeline_en

* Add model 2024-09-10-dock_4_en

* Add model 2024-09-10-hate_detector_en

* Add model 2024-09-10-roberta_base_roberta_model_enyonam_en

* Add model 2024-09-10-sentiment_bert_large_e8_b16_en

* Add model 2024-09-07-anus_wanus_panus_ranus_en

* Add model 2024-09-10-sentiment_bert_large_e8_b16_pipeline_en

* Add model 2024-09-10-roberta_finetuned_vitaminc_50k_en

* Add model 2024-09-09-finetuning_emotion_model_5_v2_pipeline_en

* Add model 2024-09-09-electra_classifier_bertic_tweetsentiment_xx

* Add model 2024-09-08-marabert22_model_pipeline_en

* Add model 2024-09-10-bsc_bio_ehr_spanish_symptemist_word2vec_75_ner_pipeline_en

* Add model 2024-09-08-facets_ep3_35_en

* Add model 2024-09-09-nerd_nerd_random3_seed2_bernice_en

* Add model 2024-09-10-best_64_shots_pipeline_en

* Add model 2024-09-10-all_mpnet_base_v2_fine_tuned_epochs_1_binhcode25_en

* Add model 2024-09-10-parameter_psb_en

* Add model 2024-09-09-bertovosentneg2_en

* Add model 2024-09-10-roberta_finetuned_subjqa_movies_2_ethegem_pipeline_en

* Add model 2024-09-10-headline_similarities_en

* Add model 2024-09-10-burmese_setfit_pipeline_en

* Add model 2024-09-10-burmese_setfit_en

* Add model 2024-09-10-babyberta_aochildes_french_wikipedia_french_without_masking_seed6_finetuned_squad_en

* Add model 2024-09-10-cot_ep3_35_en

* Add model 2024-09-10-southern_sotho_all_mpnet_finetuned_comb_1500_pipeline_en

* Add model 2024-09-10-multi_sbert_v2_pipeline_en

* Add model 2024-09-10-q2d_ep3_1234_en

* Add model 2024-09-10-all_mpnet_128_20_mnsr_base_en

* Add model 2024-09-10-mpnet_frozen_newtriplets_lr_2e_7_m_1_e_5_en

* Add model 2024-09-09-bent_pubmedbert_ner_cell_type_en

* Add model 2024-09-10-mpnet_frozen_newtriplets_lr_2e_7_m_1_e_5_pipeline_en

* Add model 2024-09-10-results_teng0929_en

* Add model 2024-09-09-emoji_emoji_random1_seed2_twitter_roberta_base_2022_154m_en

* Add model 2024-09-09-jerteh_355_sr

* Add model 2024-09-09-augment_tweet_bert_large_e4_pipeline_en

* Add model 2024-09-10-2020_q3_50p_filtered_random_en

* Add model 2024-09-09-patentbert_pipeline_en

* Add model 2024-09-06-delivery_balaned_distilbert_base_uncased_v3_pipeline_en

* Add model 2024-09-10-twitter_roberta_base_dec2021_emotion_pipeline_en

* Add model 2024-09-10-all_roberta_large_v1_meta_6_16_5_pipeline_en

* Add model 2024-09-09-robertamodel_en

* Add model 2024-09-04-deberta_v3_base_finetuned_mcqa_michaellutz_en

* Add model 2024-09-10-whisper_small_eg_en

* Add model 2024-09-10-dock_0_pipeline_en

* Add model 2024-09-09-discord_twitter_distilbert_en

* Add model 2024-09-10-dock_0_en

* Add model 2024-09-10-nlp_team_binarytoxicityclassifierforevaluationpurpose_en

* Add model 2024-09-10-2020_q2_90p_filtered_random_en

* Add model 2024-09-08-burmese_awesome_wnut_model_saidileep1007_en

* Add model 2024-09-10-roberta_reman_tec_pipeline_en

* Add model 2024-09-10-regr_3_en

* Add model 2024-09-10-roberta_stance_compqa_pipeline_en

* Add model 2024-09-10-distilbert_base_uncased_finetuned_squad_edw144_pipeline_en

* Add model 2024-09-08-xtremedistil_l12_h384_uncased_pipeline_en

* Add model 2024-09-10-nace2_level1_29_en

* Add model 2024-09-10-nace2_level1_29_pipeline_en

* Add model 2024-09-10-autonlp_predict_roi_1_29797730_en

* Add model 2024-09-10-fine_tuning_nlp_en

* Add model 2024-09-10-platzi_distilroberta_base_mrpc_glue_santirest_pipeline_en

* Add model 2024-09-09-xlm_roberta_base_finetuned_marc_begar_en

* Add model 2024-09-08-covid_tweet_sentiment_analysis_roberta_model_en

* Add model 2024-09-10-english_astitchtask1a_robertabase_falsetrue_0_0_best_en

* Add model 2024-09-10-twiiter_try8_fold0_en

* Add model 2024-09-10-hate_hate_balance_random0_seed1_twitter_roberta_base_dec2020_pipeline_en

* Add model 2024-09-08-embed_andegpt_h768_es

* Add model 2024-09-07-distilbert_base_uncased_finetuned_emotion_sunwoongee_en

* Add model 2024-09-08-sent_bert_base_uncased_google_bert_en

* Add model 2024-09-06-bert_base_german_dbmdz_cased_de

* Add model 2024-09-09-bidirection_translate_model_fixed_v0_4_en

* Add model 2024-09-09-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_dlyfar_en

* Add model 2024-09-09-clas_4_pipeline_en

* Add model 2024-09-07-distilbert_base_uncased_finetuned_squad_meline_en

* Add model 2024-09-09-helsinki_danish_swedish_v10_en

* Add model 2024-09-08-babyberta_aochildes_2_5m_aochildes_french_without_masking_seed6_finetuned_squad_pipeline_en

---------

Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com>
jsl-models and ahmedlone127 authored Sep 10, 2024
1 parent 2df9cfd commit 6eb7df9
Showing 1,259 changed files with 102,686 additions and 0 deletions.
@@ -0,0 +1,69 @@
---
layout: model
title: English bge_base_securiti_dataset_1_v19_pipeline pipeline BGEEmbeddings from MugheesAwan11
author: John Snow Labs
name: bge_base_securiti_dataset_1_v19_pipeline
date: 2024-09-03
tags: [en, open_source, pipeline, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
annotator: PipelineModel
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BGEEmbeddings pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bge_base_securiti_dataset_1_v19_pipeline` is an English model originally trained by MugheesAwan11.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_base_securiti_dataset_1_v19_pipeline_en_5.5.0_3.0_1725357366801.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_base_securiti_dataset_1_v19_pipeline_en_5.5.0_3.0_1725357366801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
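
The archive filename in the links above appears to encode the model name, language, Spark NLP version, Spark version, and a millisecond timestamp. The snippet below is a minimal sketch that splits a link into those parts, assuming that layout holds. `parse_artifact`, `ARTIFACT_RE`, and the field names are hypothetical helpers for illustration, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone

# Assumed layout (inferred from the links above, not a documented contract):
# <name>_<lang>_<spark_nlp_version>_<spark_version>_<epoch_millis>.zip
ARTIFACT_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<spark_nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_artifact(url: str) -> dict:
    """Split a model archive URL into its filename components."""
    filename = url.rsplit("/", 1)[-1]
    match = ARTIFACT_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected artifact filename: {filename}")
    fields = match.groupdict()
    # Convert the trailing epoch-milliseconds stamp to a UTC calendar date.
    seconds = int(fields.pop("millis")) // 1000
    fields["stamp_date"] = datetime.fromtimestamp(seconds, tz=timezone.utc).date().isoformat()
    return fields


info = parse_artifact(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "bge_base_securiti_dataset_1_v19_pipeline_en_5.5.0_3.0_1725357366801.zip"
)
print(info)
```

For this card, the parsed date agrees with the `date: 2024-09-03` field in the front matter, and the version fields agree with `edition` and `spark_version`.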

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

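from sparknlp.pretrained import PretrainedPipeline

# Assumes an active SparkSession `spark` and an input column named "text"
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("bge_base_securiti_dataset_1_v19_pipeline", lang = "en")
annotations = pipeline.transform(df)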

```
```scala

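import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._

// Assumes an active SparkSession `spark` and an input column named "text"
val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("bge_base_securiti_dataset_1_v19_pipeline", lang = "en")
val annotations = pipeline.transform(df)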

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bge_base_securiti_dataset_1_v19_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|381.4 MB|

## References

https://huggingface.co/MugheesAwan11/bge-base-securiti-dataset-1-v19

## Included Models

- DocumentAssembler
- BGEEmbeddings
@@ -0,0 +1,94 @@
---
layout: model
title: English burmese_translation_helsinki MarianTransformer from duwuonline
author: John Snow Labs
name: burmese_translation_helsinki
date: 2024-09-03
tags: [en, open_source, onnx, translation, marian]
task: Translation
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: MarianTransformer
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained MarianTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_translation_helsinki` is an English model originally trained by duwuonline.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_translation_helsinki_en_5.5.0_3.0_1725345497152.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_translation_helsinki_en_5.5.0_3.0_1725345497152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, MarianTransformer
from pyspark.ml import Pipeline

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into sentences; MarianTransformer reads this column
sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

# Translate each sentence and write the result to "translation"
marian = MarianTransformer.pretrained("burmese_translation_helsinki","en") \
    .setInputCols(["sentence"]) \
    .setOutputCol("translation")

pipeline = Pipeline().setStages([documentAssembler, sentenceDL, marian])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split each document into sentences; MarianTransformer reads this column
val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

// Translate each sentence and write the result to "translation"
val marian = MarianTransformer.pretrained("burmese_translation_helsinki","en")
  .setInputCols(Array("sentence"))
  .setOutputCol("translation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, marian))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|burmese_translation_helsinki|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentences]|
|Output Labels:|[translation]|
|Language:|en|
|Size:|474.8 MB|

## References

https://huggingface.co/duwuonline/my-translation-helsinki
@@ -0,0 +1,70 @@
---
layout: model
title: English distilroberta_sst2_pipeline pipeline RoBertaForSequenceClassification from gokuls
author: John Snow Labs
name: distilroberta_sst2_pipeline
date: 2024-09-03
tags: [en, open_source, pipeline, onnx]
task: Text Classification
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
annotator: PipelineModel
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained RoBertaForSequenceClassification pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_sst2_pipeline` is an English model originally trained by gokuls.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_sst2_pipeline_en_5.5.0_3.0_1725369518559.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_sst2_pipeline_en_5.5.0_3.0_1725369518559.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

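from sparknlp.pretrained import PretrainedPipeline

# Assumes an active SparkSession `spark` and an input column named "text"
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("distilroberta_sst2_pipeline", lang = "en")
annotations = pipeline.transform(df)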

```
```scala

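import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._

// Assumes an active SparkSession `spark` and an input column named "text"
val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("distilroberta_sst2_pipeline", lang = "en")
val annotations = pipeline.transform(df)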

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|distilroberta_sst2_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|308.6 MB|

## References

https://huggingface.co/gokuls/distilroberta-sst2

## Included Models

- DocumentAssembler
- TokenizerModel
- RoBertaForSequenceClassification
@@ -0,0 +1,94 @@
---
layout: model
title: English opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat MarianTransformer from Theetawat
author: John Snow Labs
name: opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat
date: 2024-09-03
tags: [en, open_source, onnx, translation, marian]
task: Translation
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: MarianTransformer
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained MarianTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat` is an English model originally trained by Theetawat.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat_en_5.5.0_3.0_1725345502547.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat_en_5.5.0_3.0_1725345502547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, MarianTransformer
from pyspark.ml import Pipeline

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into sentences; MarianTransformer reads this column
sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

# Translate each sentence and write the result to "translation"
marian = MarianTransformer.pretrained("opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat","en") \
    .setInputCols(["sentence"]) \
    .setOutputCol("translation")

pipeline = Pipeline().setStages([documentAssembler, sentenceDL, marian])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split each document into sentences; MarianTransformer reads this column
val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

// Translate each sentence and write the result to "translation"
val marian = MarianTransformer.pretrained("opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat","en")
  .setInputCols(Array("sentence"))
  .setOutputCol("translation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, marian))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|opus_maltese_thai_english_finetuned_english_tonga_tonga_islands_thai_theetawat|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentences]|
|Output Labels:|[translation]|
|Language:|en|
|Size:|524.2 MB|

## References

https://huggingface.co/Theetawat/opus-mt-th-en-finetuned-en-to-th