2024-09-06-xlm_roberta_base_finetuned_panx_german_ahmad_alismail_pipe…
…line_en (#14401)

* Add model 2024-09-17-whisper_base_portuguese_zuazo_pipeline_pt

* Add model 2024-09-16-code_human_ai_pipeline_en

* Add model 2024-09-12-gal_ner_xlmr_2_pipeline_en

* Add model 2024-09-16-kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent3_en

* Add model 2024-09-09-roberta_base_catalan_v2_ca

* Add model 2024-09-13-sent_bert_base_nli_ct_en

* Add model 2024-09-14-whisper_small_urdu_omar47_pipeline_ur

* Add model 2024-09-12-xlm_roberta_base_finetuned_panx_german_r45289_pipeline_en

* Add model 2024-09-17-xlm_roberta_base_operator_en

* Add model 2024-09-12-bert_resume_classification_en

* Add model 2024-09-07-distillber_squadv2_pipeline_en

* Add model 2024-09-09-f_roberta_classifier2_pipeline_en

* Add model 2024-09-08-self_harm_bert_en

* Add model 2024-09-13-roberta_base_epoch_72_en

* Add model 2024-09-17-hfa_poly_english_small_pipeline_en

* Add model 2024-09-17-bert_base_uncased_ep_1_12_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_200_en

* Add model 2024-09-16-bertweet_large_epoch6_batch4_lr2e_05_w0_005_pipeline_en

* Add model 2024-09-13-gqa_roberta_german_legal_squad_part_augmented_2000_pipeline_de

* Add model 2024-09-17-whisper_small_ndonga_pipeline_en

* Add model 2024-09-17-malwhisper_v1_small_ml

* Add model 2024-09-17-whisper_tiny_indonesian_evanarlian_id

* Add model 2024-09-12-roberta_base_finetuned_squad_f_arnold_en

* Add model 2024-09-16-opus_maltese_english_romanian_finetuned_romanian_tonga_tonga_islands_english_pontifexmaximus_pipeline_en

* Add model 2024-09-17-whisper_small_marathi_steja_mr

* Add model 2024-09-15-whisper_tiny_turkish_ckandemir_tr

* Add model 2024-09-11-babyberta_aochildes_2_5m_wikipedia1_2_5m_with_masking_seed3_finetuned_squad_en

* Add model 2024-09-09-q2d_origin_re_5_en

* Add model 2024-09-15-kaz_roberta_base_ft_qa_turkish_maltese_tonga_tonga_islands_kaz_pipeline_kk

* Add model 2024-09-16-nerd_nerd_random0_seed0_twitter_roberta_base_dec2020_en

* Add model 2024-09-17-whisper_base_catalan_pipeline_ca

* Add model 2024-09-17-bsc_bio_ehr_spanish_drugtemist_pipeline_es

* Add model 2024-09-17-whisper_small_basque_cv16_1_eu

* Add model 2024-09-17-whisper_small_basque_cv16_1_pipeline_eu

* Add model 2024-09-16-dataequity_opus_maltese_spanish_english_pipeline_en

* Add model 2024-09-11-transfer_course_distilroberta_base_mrpc_glue_nestor_mamani_en

* Add model 2024-09-15-xlmr_romanian_english_all_shuffled_42_test1000_en

* Add model 2024-09-15-distilbert_base_uncased_finetuned_emotion_deionk_en

* Add model 2024-09-08-lab1_random_chenxin0903_pipeline_en

* Add model 2024-09-07-xlmroberta_ner_xlm_roberta_base_finetuned_panx_ner_pipeline_it

* Add model 2024-09-14-sent_hinglish_sbert_en

* Add model 2024-09-09-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_finetuned_english_tonga_tonga_islands_romanian_en

* Add model 2024-09-09-opus_maltese_finetuned_english_spanish_en

* Add model 2024-09-14-opus_maltese_italian_english_bds_en

* Add model 2024-09-16-vanilla_dermat_es

* Add model 2024-09-08-burmese_awesome_wnut_model_jaepax_pipeline_en

* Add model 2024-09-09-xlm_roberta_base_finetuned_marc_begar_pipeline_en

* Add model 2024-09-13-splade_pp_english_v2_en

* Add model 2024-09-11-gpu1_pipeline_en

* Add model 2024-09-13-sent_bert_medium_arabic_pipeline_ar

* Add model 2024-09-14-marian_finetuned_kde4_english_tonga_tonga_islands_french_accelerate_huggingface_course_en

* Add model 2024-09-12-finetuned_twitter_targeted_insult_roberta_en

* Add model 2024-09-12-flat_model_pipeline_en

* Add model 2024-09-11-xlm_roberta_base_finetuned_panx_french_sungkwangjoong_pipeline_en

* Add model 2024-09-12-hate_hate_balance_random3_seed0_twitter_roberta_base_2021_124m_en

* Add model 2024-09-15-distilbert_base_uncased_odm_zphr_0st2sd_ut72ut1_plprefix0stlarge2_simsp400_clean100_pipeline_en

* Add model 2024-09-09-chai_reward_deberta_classifier_en

* Add model 2024-09-09-xlm_roberta_base_finetuned_panx_german_italian_en

* Add model 2024-09-17-your_model_name_en

* Add model 2024-09-17-your_model_name_pipeline_en

* Add model 2024-09-13-khipu_finetuned_amazon_reviews_multi_andrescastro_itm_en

* Add model 2024-09-11-burmese_awesome_qa_model_ih138_en

* Add model 2024-09-09-deberta_disaster_tweet_recognizer_pipeline_en

* Add model 2024-09-09-opus_maltese_turkish_tonga_tonga_islands_english_pipeline_en

* Add model 2024-09-17-bangla_asr_v7_pipeline_bn

* Add model 2024-09-15-distilbert_base_uncased_odm_zphr_0st4sd_ut72ut5_plprefix0stlarge4_simsp100_clean300_en

* Add model 2024-09-17-bert_base_squad_v1_1_portuguese_ibama_v0_220240904182329_en

* Add model 2024-09-11-bsc_bio_ehr_spanish_vih_juicio_anam_urgen_en

* Add model 2024-09-15-xlm_roberta_base_finetuned_panx_italian_scionk_pipeline_en

* Add model 2024-09-12-finetuning_emotion_model_surajmahapatra_pipeline_en

* Add model 2024-09-15-distilbert_finetuned_custom_pipeline_en

* Add model 2024-09-13-msc_baseline_marian_pipeline_en

* Add model 2024-09-17-whisper_small_bengali_crblp_pipeline_bn

* Add model 2024-09-17-whisper_small_custom300_1e_5_va2000_pipeline_en

* Add model 2024-09-15-takalane_ssw_roberta_pipeline_tn

* Add model 2024-09-14-burmese_awesome_qa_model_sazara_pipeline_en

* Add model 2024-09-16-definition_classification_v1_en

* Add model 2024-09-17-whisper_base_portuguese_zuazo_pt

* Add model 2024-09-10-burmese_nepal_bhasa_model_pipeline_en

* Add model 2024-09-13-deberta_v3_large_survey_related_passage_consistency_rater_half_gpt4_pipeline_en

* Add model 2024-09-10-xlm_roberta_base_finetuned_panx_german_french_ericklerouge123_pipeline_en

* Add model 2024-09-17-whisper4_en

* Add model 2024-09-17-workstation_whisper_base_finetune_teacher__babble_noise_mozilla_100_epochs_batch_4_pipeline_en

* Add model 2024-09-17-hate_hate_balance_random3_seed2_bernice_pipeline_en

* Add model 2024-09-08-roberta_base_climate_evidence_related_en

* Add model 2024-09-05-question_answering_xlm_roberta_base_pipeline_en

* Add model 2024-09-17-dipromats_subtask_1_base_train_en

* Add model 2024-09-17-whisper_cli_dropout_small_oriya_pipeline_or

* Add model 2024-09-13-distilbert_base_uncased_finetuned_emotion_xiumu1988_en

* Add model 2024-09-07-hupd_distilroberta_base_pipeline_en

* Add model 2024-09-17-whisper_small_yoruba_kaggle_train_pipeline_en

* Add model 2024-09-09-distilbert_base_uncased_finetuned_emotion_saneryi_en

* Add model 2024-09-09-arabic2_en

* Add model 2024-09-13-culturebank_controversial_classifier_en

* Add model 2024-09-14-opus_maltese_indonesian_english_jakarta_best_loss_bleu_en

* Add model 2024-09-15-text_clf_model_v03_pipeline_en

* Add model 2024-09-13-sent_bulbert_chitanka_model_pipeline_bg

* Add model 2024-09-17-whisper_small_korean_yspeed_hi

* Add model 2024-09-15-xlm_roberta_base_finetuned_panx_french_haesun_en

* Add model 2024-09-16-english_tonga_tonga_islands_arabic_version3_en

* Add model 2024-09-17-ep15_pipeline_en

* Add model 2024-09-17-bert_base_uncased_ep_1_12_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_200_pipeline_en

* Add model 2024-09-15-distilbert_base_uncased_finetuned_emrqa_msquad_pipeline_en

* Add model 2024-09-16-multiple_languages_coptic_english_norm_group_greekified_pipeline_en

* Add model 2024-09-11-ternary_persian_sentiment_analysis_pipeline_en

* Add model 2024-09-15-efficient_mlm_m0_15_801010_pipeline_en

* Add model 2024-09-17-whisper_small_arnw_ar

* Add model 2024-09-17-roberta_tagalog_base_ft_udpos213_serbian_pipeline_tl

* Add model 2024-09-17-whisper_small_swedish_v4_sv

* Add model 2024-09-17-whisper_small_hindi_mukund017_hi

* Add model 2024-09-17-whisper_tiny_engmed_v2_pipeline_en

* Add model 2024-09-14-whisper_small_cebtoeng_hi

* Add model 2024-09-17-whisper_small_galician_zuazo_gl

* Add model 2024-09-17-whisper_tiny_italian_6_it

* Add model 2024-09-17-whisper_small_singlish_augmented_again_1200steps_en

* Add model 2024-09-08-cross_encoder_russian_msmarco_ru

* Add model 2024-09-17-whisper_tiny_minds_malikibrar_pipeline_en

* Add model 2024-09-06-task_implicit_task__model_deberta__aug_method_ri_en

* Add model 2024-09-11-xlm_roberta_base_mapa_coarse_ner_en

* Add model 2024-09-13-bert_large_uncased_sst2_pipeline_en

* Add model 2024-09-11-burmese_awesome_model_hannestt_pipeline_en

* Add model 2024-09-15-custom_peft_whiper_small_korean_v3_en

* Add model 2024-09-14-xlm_roberta_base_finetuned_panx_french_kata958_pipeline_en

* Add model 2024-09-13-sms_spam_model_v1_2_en

* Add model 2024-09-17-whisper_tiny_minds14_english_bayerasif_pipeline_en

* Add model 2024-09-12-dken_en

* Add model 2024-09-15-distilbert_base_uncased_finetuned_emotion_maydogdu_pipeline_en

* Add model 2024-09-12-datosw_v1_2_pipeline_en

* Add model 2024-09-12-marian_finetuned_kde4_english_tonga_tonga_islands_french_vonewman_pipeline_en

* Add model 2024-09-16-emoji_emoji_random1_seed2_twitter_roberta_base_2021_124m_pipeline_en

* Add model 2024-09-15-distilbert_sanskrit_saskta_glue_experiment_logit_kd_pretrain_stsb_pipeline_en

* Add model 2024-09-16-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_diegoalysson_en

* Add model 2024-09-16-opus_maltese_semitic_languages_english_finetuned_npomo_english_15_epochs_pipeline_en

* Add model 2024-09-09-output_sotseth_pipeline_en

* Add model 2024-09-17-base_english_combined_v4_2_0_1_8_1e_05_dulcet_sweep_34_en

* Add model 2024-09-17-whisper_base_chuvash_highlr_czech_cs

* Add model 2024-09-11-fine_tune_spatial_pipeline_en

* Add model 2024-09-12-xlm_roberta_base_finetuned_panx_all_hirosay_pipeline_en

* Add model 2024-09-09-bge_base_financial_matryoshka_dpokhrel_en

* Add model 2024-09-07-study_dummy_en

* Add model 2024-09-17-whisper_small_ne2_1_en

* Add model 2024-09-11-saved_model_body_pipeline_en

* Add model 2024-09-15-mini_text_classification_finetune_model_pipeline_en

* Add model 2024-09-15-furina_seed42_eng_kinyarwanda_amh_cross_0_0001_en

* Add model 2024-09-16-opus_maltese_english_russian_finetuned_pipeline_en

* Add model 2024-09-15-finetuning_sentiment_model_3000_samples_klumdedum_en

* Add model 2024-09-10-distilbert_base_uncased_fillmask_finetuned_imdb_classifier_nlp_course_chapter7_section2_pipeline_en

* Add model 2024-09-10-xlmr_finetuned_igbo_en

* Add model 2024-09-16-focaltrain_pipeline_en

* Add model 2024-09-17-tinymax_pipeline_en

* Add model 2024-09-12-amharicqa_roberta_en

* Add model 2024-09-14-xlm_roberta_base_finetuned_panx_german_ysige_pipeline_en

* Add model 2024-09-15-whisper_small_urdu_howmannymore_pipeline_en

* Add model 2024-09-14-finetuned_marianmtmodel_v4_specialfrom_ccmatrix77k_en

* Add model 2024-09-14-model_for_french_pipeline_en

* Add model 2024-09-09-best_model_yelp_polarity_64_42_en

* Add model 2024-09-09-opus_maltese_english_german_bds_pipeline_en

* Add model 2024-09-15-cuatr_distilbert_en

* Add model 2024-09-12-xlm_roberta_base_finetuned_panx_french_jbreunig_pipeline_en

* Add model 2024-09-10-whisper_small_tuned_en

* Add model 2024-09-17-malasar_luke_dict_nan

* Add model 2024-09-17-tiny_english_uva_chunked_with_synthetic_v2_4_1e_05_pipeline_en

* Add model 2024-09-17-whisper_tiny_spanish_herme_pipeline_es

* Add model 2024-09-10-incremental_semi_supervised_training_base_pipeline_en

* Add model 2024-09-14-shami2english_en

* Add model 2024-09-16-roberta_large_mnli_fld_pipeline_en

* Add model 2024-09-10-roberta_base_bne_squad2_spanish_es

* Add model 2024-09-15-fine_tuned_distilbert_isha31101999_pipeline_en

* Add model 2024-09-17-whisper_base_chinese_cer_zh

* Add model 2024-09-15-distilbert_base_uncased_finetuned_squad_wendywangwww_pipeline_en

* Add model 2024-09-08-distilbert_base_uncased_finetuned_ner_cnguyenta_en

* Add model 2024-09-17-whisper_small_hungarian_gyikesz_pipeline_hu

* Add model 2024-09-13-all_roberta_large_v1_travel_3_16_5_en

* Add model 2024-09-12-opus_maltese_english_spanish_finetuned_english_tonga_tonga_islands_spanish_tamil_5epochs_pipeline_en

* Add model 2024-09-10-cross_all_bs160_allneg_finetuned_webnlg2020_relevance_en

* Add model 2024-09-12-whisper_small_cv17_hungarian_hu

* Add model 2024-09-17-whisper_tiny_julienchoukroun_en

* Add model 2024-09-15-scenario_tcr_4_data_cardiffnlp_tweet_sentiment_multilingual_all_pipeline_xx

* Add model 2024-09-14-finroberta_pipeline_en

* Add model 2024-09-11-opus_maltese_english_bkm_final_60_pipeline_en

* Add model 2024-09-16-lab1_finetuning_den_sota_en

* Add model 2024-09-17-breeze_dsw_tiny_indonesian_id

* Add model 2024-09-17-breeze_dsw_tiny_indonesian_pipeline_id

* Add model 2024-09-13-roberta_base_epoch_60_en

* Add model 2024-09-05-ae_detection_distilbert_pipeline_en

* Add model 2024-09-17-whisper_small_train_v2_1_en

* Add model 2024-09-11-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_redpandaainlp_en

* Add model 2024-09-17-whisper_small_train_v2_1_pipeline_en

* Add model 2024-09-12-burmese_awesome_model_willw9758_pipeline_en

* Add model 2024-09-08-recipes_trainer_n_sentences_per_recipe_3_sep_true_pipeline_en

* Add model 2024-09-17-whisper_tiny_ga2en_v1_4_ga

* Add model 2024-09-17-whisper_small_arnw_pipeline_ar

* Add model 2024-09-16-turkish_medical_field_detection_8_pipeline_en

* Add model 2024-09-16-roberta_large_finetuned_cola_cvapict_en

* Add model 2024-09-17-whisper_small_hindi_mukund017_pipeline_hi

* Add model 2024-09-17-whisper_small_divehi_agercas_dv

* Add model 2024-09-17-roberta_large_metaie_super_academia_gpt4o_pipeline_en

* Add model 2024-09-08-schemeclassifier_eng_en

* Add model 2024-09-14-whisper_small_russian_1k_steps_ru

* Add model 2024-09-14-luxembert_v2_en

* Add model 2024-09-10-roberta_base_coqa_en

* Add model 2024-09-15-grammar_classifier_pipeline_en

* Add model 2024-09-14-whisper_medium_uzbek_extra_dataset_v2_en

* Add model 2024-09-10-xlm_roberta_base_finetuned_panx_all_youngbreadho_en

* Add model 2024-09-14-whisper_small3_italian_it

* Add model 2024-09-12-first_qa_model_pipeline_en

* Add model 2024-09-15-whisper_base_cer_gn

* Add model 2024-09-15-stego_classifier_checkpoint_epoch_30_2024_07_26_12_23_45_pipeline_en

* Add model 2024-09-09-opus_maltese_english_dutch_finetuned_combined_38_train_val_pipeline_en

* Add model 2024-09-07-idiom_xlm_roberta_en

* Add model 2024-09-06-dummy_model_fab7_pipeline_en

* Add model 2024-09-09-chai_deberta_v3_base_reward_model_pipeline_en

* Add model 2024-09-11-distilbert_sarcascm_classifier_en

* Add model 2024-09-17-chinese_roberta_wwm_ext_2_0_8_ddp_en

* Add model 2024-09-07-dummy_model_hanzhuo_pipeline_en

* Add model 2024-09-10-training_v2_ru

* Add model 2024-09-17-metaqa_en

* Add model 2024-09-17-metaqa_pipeline_en

* Add model 2024-09-16-opus_maltese_slavic_languages_english_finetuned_ukrainian_tonga_tonga_islands_english_pipeline_en

* Add model 2024-09-17-predict_perception_xlmr_cause_object_en

* Add model 2024-09-15-distilbert_finetuned_squadv2_mf212_en

* Add model 2024-09-12-quran_whisper_tiny_v1_ar

* Add model 2024-09-13-bert_vllm_gemma2b_7_pipeline_en

* Add model 2024-09-11-all_mpnet_base_v2_2022_11_07_pipeline_en

* Add model 2024-09-17-bert_base_german_cased_finetuned_squad_en

* Add model 2024-09-14-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_minzzi_en

* Add model 2024-09-17-xlm_roberta_base_vtoc_100_pipeline_en

* Add model 2024-09-11-phishing_email_detection_21_07_en

* Add model 2024-09-09-lab1_random_reshphil_en

* Add model 2024-09-09-maltese_coref_english_arabic_gender_exp_pipeline_en

* Add model 2024-09-15-roberta_tuned_trial_13_13_2022_en

* Add model 2024-09-15-models_mil00_pipeline_en

* Add model 2024-09-08-roberta_base_squad2_finetuned_squad_katxtong_en

* Add model 2024-09-08-sent_xlm_roberta_base_finetuned_burmese_dear_watson2_pipeline_en

* Add model 2024-09-17-burmese_awesome_qa_model_nada_ghazouani_pipeline_en

* Add model 2024-09-17-finetuned_bert_model_squad_datset_pipeline_en

* Add model 2024-09-17-distilbert_base_uncased_squad2_p35_pipeline_en

* Add model 2024-09-17-distilbert_base_uncased_finetuned_squad_hashemghanem_pipeline_en

* Add model 2024-09-17-burmese_awesome_qa_model_lash_en

* Add model 2024-09-17-burmese_awesome_qa_model_lash_pipeline_en

* Add model 2024-09-12-italian_emotion_analyzer_it

* Add model 2024-09-17-distilbert_base_uncased_finetuned_squad_d5716d28_serhii_korobchenko_pipeline_en

* Add model 2024-09-17-distilbert_base_cased_distilled_squad_full_lora_merged_en

* Add model 2024-09-11-multi_qa_mpnet_base_dot_v1_covidqa_search_75_25_2epoch_full_en

* Add model 2024-09-11-klue_bert_base_sentiment_pipeline_ko

* Add model 2024-09-17-burmese_qa_model_yadah_pipeline_en

* Add model 2024-09-17-whisper_small_yoruba_kaggle_train_en

* Add model 2024-09-15-tuf_albert_5e_en

* Add model 2024-09-09-quberta_qu

* Add model 2024-09-17-whisper_tiny_chinese_zhihcheng_pipeline_zh

* Add model 2024-09-11-distilbert_base_uncased_finetuned_squad_test2_en

---------

Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com>
jsl-models and ahmedlone127 authored Sep 17, 2024
1 parent fc72501 commit bf76942
Showing 1,372 changed files with 110,508 additions and 0 deletions.
94 changes: 94 additions & 0 deletions docs/_posts/ahmedlone127/2024-09-02-bulbert_chitanka_model_bg.md
@@ -0,0 +1,94 @@
---
layout: model
title: Bulgarian bulbert_chitanka_model BertEmbeddings from mor40
author: John Snow Labs
name: bulbert_chitanka_model
date: 2024-09-02
tags: [bg, open_source, onnx, embeddings, bert]
task: Embeddings
language: bg
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: BertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bulbert_chitanka_model` is a Bulgarian model originally trained by mor40.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bulbert_chitanka_model_bg_5.5.0_3.0_1725318518639.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bulbert_chitanka_model_bg_5.5.0_3.0_1725318518639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
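
The archive name in the links above appears to encode the model name, language code, Spark NLP version, Spark version, and an upload timestamp in epoch milliseconds. That layout is inferred only from the URLs on this page, not from a documented convention, and the names `ModelArchive` and `parse_archive_name` are hypothetical helpers introduced for illustration. A minimal sketch of splitting such a name into its parts:

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArchive(NamedTuple):
    name: str
    lang: str
    sparknlp_version: str
    spark_version: str
    uploaded_at: datetime


# Assumed layout (inferred from the links on this page, not documented):
#   <name>_<lang>_<Spark NLP version>_<Spark version>_<epoch millis>.zip
_ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<nlp>\d+(?:\.\d+)+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_archive_name(url_or_filename: str) -> ModelArchive:
    """Split a model archive URL or file name into its assumed components."""
    filename = url_or_filename.rsplit("/", 1)[-1]
    match = _ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised archive name: {filename!r}")
    # The trailing 13-digit field is treated as milliseconds since the Unix epoch.
    uploaded_at = datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc)
    return ModelArchive(
        match["name"], match["lang"], match["nlp"], match["spark"], uploaded_at
    )
```

Applied to the download link above, this yields the name `bulbert_chitanka_model`, language `bg`, and versions `5.5.0` and `3.0`, which agree with the `edition` and `spark_version` fields in the front matter.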

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings

# Assumes an active Spark NLP session bound to `spark`, e.g. spark = sparknlp.start()
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

embeddings = BertEmbeddings.pretrained("bulbert_chitanka_model","bg") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDF on a Seq

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val embeddings = BertEmbeddings.pretrained("bulbert_chitanka_model","bg")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bulbert_chitanka_model|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[bert]|
|Language:|bg|
|Size:|306.1 MB|

## References

https://huggingface.co/mor40/BulBERT-chitanka-model
94 changes: 94 additions & 0 deletions docs/_posts/ahmedlone127/2024-09-03-oorito_en.md
@@ -0,0 +1,94 @@
---
layout: model
title: English oorito MarianTransformer from LRJ1981
author: John Snow Labs
name: oorito
date: 2024-09-03
tags: [en, open_source, onnx, translation, marian]
task: Translation
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: MarianTransformer
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained MarianTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `oorito` is an English model originally trained by LRJ1981.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/oorito_en_5.5.0_3.0_1725404166090.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/oorito_en_5.5.0_3.0_1725404166090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
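```python

from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, MarianTransformer

# Assumes an active Spark NLP session bound to `spark`, e.g. spark = sparknlp.start()
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# The sentence detector must write to "sentence", which the translator reads
sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

marian = MarianTransformer.pretrained("oorito","en") \
    .setInputCols(["sentence"]) \
    .setOutputCol("translation")

pipeline = Pipeline().setStages([documentAssembler, sentenceDL, marian])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```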
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDF on a Seq

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Names match the stages passed to setStages below
val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

val marian = MarianTransformer.pretrained("oorito","en")
  .setInputCols(Array("sentence"))
  .setOutputCol("translation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, marian))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|oorito|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentences]|
|Output Labels:|[translation]|
|Language:|en|
|Size:|504.7 MB|

## References

https://huggingface.co/LRJ1981/OORito
@@ -0,0 +1,94 @@
---
layout: model
title: English deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4 DeBertaForSequenceClassification from domenicrosati
author: John Snow Labs
name: deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4
date: 2024-09-04
tags: [en, open_source, onnx, sequence_classification, deberta]
task: Text Classification
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: DeBertaForSequenceClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4` is an English model originally trained by domenicrosati.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4_en_5.5.0_3.0_1725440063827.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4_en_5.5.0_3.0_1725440063827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
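```python

from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DeBertaForSequenceClassification

# Assumes an active Spark NLP session bound to `spark`, e.g. spark = sparknlp.start()
documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Input columns must match the upstream output names: "document" and "token"
sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```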
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDS / .toDF on a Seq

// DocumentAssembler takes a single input and output column
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[class]|
|Language:|en|
|Size:|1.5 GB|

## References

https://huggingface.co/domenicrosati/deberta-v3-large-survey-new_fact_main_passage-rater-gpt4
@@ -0,0 +1,70 @@
---
layout: model
title: English dummy_model_umalakshmi07_pipeline pipeline CamemBertEmbeddings from Umalakshmi07
author: John Snow Labs
name: dummy_model_umalakshmi07_pipeline
date: 2024-09-04
tags: [en, open_source, pipeline, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
annotator: PipelineModel
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained CamemBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_model_umalakshmi07_pipeline` is an English model originally trained by Umalakshmi07.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_model_umalakshmi07_pipeline_en_5.5.0_3.0_1725409109729.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_model_umalakshmi07_pipeline_en_5.5.0_3.0_1725409109729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
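```python

from sparknlp.pretrained import PretrainedPipeline

# df: an existing Spark DataFrame; the input column is assumed to be named "text"
pipeline = PretrainedPipeline("dummy_model_umalakshmi07_pipeline", lang = "en")
annotations = pipeline.transform(df)

```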
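```scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// df: an existing Spark DataFrame; the input column is assumed to be named "text"
val pipeline = new PretrainedPipeline("dummy_model_umalakshmi07_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```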
</div>
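
The snippet above assumes an existing DataFrame `df`. A minimal sketch of one way to build it and run the pipeline follows. It assumes the pipeline's DocumentAssembler reads a column named `text`, as in the other model cards in this commit, and the helpers `to_rows` and `annotate_texts` are hypothetical names introduced for illustration. Spark NLP is imported inside the function so the module loads without a Spark installation.

```python
from typing import Iterable, List


def to_rows(texts: Iterable[str]) -> List[List[str]]:
    """Wrap each input string in a one-element row for a single-column DataFrame."""
    return [[text] for text in texts]


def annotate_texts(
    texts: Iterable[str],
    pipeline_name: str = "dummy_model_umalakshmi07_pipeline",
    lang: str = "en",
):
    """Build a one-column DataFrame and run the pretrained pipeline on it."""
    # Imported lazily so this module can be loaded without Spark installed.
    import sparknlp
    from sparknlp.pretrained import PretrainedPipeline

    spark = sparknlp.start()
    # Assumption: the pipeline's DocumentAssembler reads a column named "text".
    df = spark.createDataFrame(to_rows(texts)).toDF("text")
    pipeline = PretrainedPipeline(pipeline_name, lang=lang)
    return pipeline.transform(df)
```

Calling `annotate_texts(["I love spark-nlp"])` returns the transformed DataFrame. The first call downloads the pipeline archive, so it needs network access.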

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|dummy_model_umalakshmi07_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|264.0 MB|

## References

https://huggingface.co/Umalakshmi07/dummy-model

## Included Models

- DocumentAssembler
- TokenizerModel
- CamemBertEmbeddings