2024-09-18-imdb_0_pipeline_en (#14405)
* Add model 2024-09-14-transcriber_small_en

* Add model 2024-09-09-roberta_qa_model_10k_pipeline_en

* Add model 2024-09-22-whisper_small_persian_farsi_javadr_fa

* Add model 2024-09-22-darkstar_bert_ome1_pipeline_en

* Add model 2024-09-12-opus_maltese_english_indonesian_jakarta_best_loss_bleu_en

* Add model 2024-09-21-sent_bert_base_uncased_finetuned_wallisian_manual_3ep_lower_en

* Add model 2024-09-12-english_zhtw_en

* Add model 2024-09-22-whisper_small_italian_edoabati_pipeline_it

* Add model 2024-09-22-whisper_small_macedonian_pipeline_mk

* Add model 2024-09-13-bulbert_wiki_bulgarian_pipeline_en

* Add model 2024-09-08-response_toxicity_classifier_base_pipeline_ru

* Add model 2024-09-22-bert_base_cased_plane_ood_2_en

* Add model 2024-09-22-cold_fusion_itr2_seed4_en

* Add model 2024-09-19-distilbert_base_uncased_md_gender_bias_trained_pipeline_en

* Add model 2024-09-19-distilbert_base_uncased_fold_4_pipeline_en

* Add model 2024-09-22-whisper_small_hre3_pipeline_en

* Add model 2024-09-21-bert_base_uncased_mrpc_serjssv_pipeline_en

* Add model 2024-09-22-whisper_small_hindi_dv

* Add model 2024-09-22-whisper_small_hindi_pipeline_dv

* Add model 2024-09-16-roberta_augmented_finetuned_atis_1pct_v2_en

* Add model 2024-09-20-covid_roberta_60_masked_pipeline_en

* Add model 2024-09-22-voicemath_tiny_pipeline_en

* Add model 2024-09-20-distilbert_base_uncased_finetuned_emotion_raota_pipeline_en

* Add model 2024-09-19-hausa_sentiment_analysis_ha

* Add model 2024-09-13-qa_model_zehralx_en

* Add model 2024-09-20-roberta_topic_pipeline_en

* Add model 2024-09-18-divya_resume_model_pipeline_en

* Add model 2024-09-10-all_mpnet_lr1e_8_margin_1_bosnian_32_en

* Add model 2024-09-19-xlm_roberta_base_finetuned_panx_german_duykha0511_pipeline_en

* Add model 2024-09-10-roberta_tagalog_base_pipeline_en

* Add model 2024-09-22-genztranscribe_base_hindi_en

* Add model 2024-09-18-burmese_model_parsawar_pipeline_en

* Add model 2024-09-21-clasificadorcorreosoportedistilespanol_dataser_en

* Add model 2024-09-20-distilbert_base_uncased_ft_2_pipeline_en

* Add model 2024-09-20-autotrain_63go1_k0lzp_pipeline_en

* Add model 2024-09-09-question_answering_hemg_en

* Add model 2024-09-22-whisper_small_indonesian_v1_pipeline_en

* Add model 2024-09-18-moviegenreprediction_pipeline_en

* Add model 2024-09-19-text_classification_sms_model_pipeline_en

* Add model 2024-09-13-xlm_robert_base_finetuned_panx_german_french_en

* Add model 2024-09-06-whisper_small_english_accented_pipeline_en

* Add model 2024-09-22-whisper_base_malayalam_ml

* Add model 2024-09-21-t_6_en

* Add model 2024-09-22-whisper_small_mongolian_dorjzodovsuren_pipeline_mn

* Add model 2024-09-22-whisper_small_mongolian_dorjzodovsuren_mn

* Add model 2024-09-22-whisper_small_uoseftalaat_en

* Add model 2024-09-22-whisper_small_uoseftalaat_pipeline_en

* Add model 2024-09-22-whisper_small_estonian_rristo_et

* Add model 2024-09-21-roberta_base_finetuned_abs_en

* Add model 2024-09-22-hp_search_deberta_pipeline_en

* Add model 2024-09-22-whisper_small_korean_eyfreq_speed_hi

* Add model 2024-09-21-whisper_tiny_malayalam_sid330_pipeline_en

* Add model 2024-09-22-cat_sayula_popoluca_iwcg_5_pipeline_en

* Add model 2024-09-22-whisper_small_irish_ga

* Add model 2024-09-21-toxic_comment_model_toxicity_ft_en

* Add model 2024-09-21-clinicalbert_aci_bench_section_classifier_pipeline_en

* Add model 2024-09-22-ner_productname_en

* Add model 2024-09-21-bert_vllm_gemma2b_llmoversight_0_5_nodropsus_1_pipeline_en

* Add model 2024-09-22-distill_whisper_jargon_btemirov_en

* Add model 2024-09-20-personality_lm_en

* Add model 2024-09-21-whisper_small_romanian_cv11_ro

* Add model 2024-09-13-distilbert_base_uncased_finetuned_squad_nahomk_en

* Add model 2024-09-18-legal_undedup_base_v1_5__checkpoint_last_pipeline_en

* Add model 2024-09-21-whisper_small_russian_lorenzoncina_ru

* Add model 2024-09-17-distilbert_base_uncased_finetuned_squad_thsohn_pipeline_en

* Add model 2024-09-21-zeli_category_en

* Add model 2024-09-08-lab1_finetuning_twobjohn_pipeline_en

* Add model 2024-09-20-trainerh2_en

* Add model 2024-09-13-bert_ner_custom_vasanth_pipeline_en

* Add model 2024-09-20-distilbert_base_uncased_finetuned_emotion_devborbot_pipeline_en

* Add model 2024-09-21-twitterfin_padding100model_pipeline_en

* Add model 2024-09-11-opus_big_enfr_ft_wang_2022_pipeline_en

* Add model 2024-09-22-whisper_small_nomo_en

* Add model 2024-09-22-whisper_small_italian_edoabati_it

* Add model 2024-09-22-whisper_small_serbian_combined_pipeline_sr

* Add model 2024-09-18-roberta_nli_group71_en

* Add model 2024-09-11-roberta_finetuned_chennaiqa_10_pipeline_en

* Add model 2024-09-21-whisper_small_persian_farsi_1k_steps_pipeline_fa

* Add model 2024-09-18-trainer2f_pipeline_en

* Add model 2024-09-20-roberta_large_ontonotes_en

* Add model 2024-09-15-multi_label_class_classification_on_github_issues_pipeline_en

* Add model 2024-09-22-nbme_roberta_large_en

* Add model 2024-09-22-robertacnnrnnfnntransformer2_en

* Add model 2024-09-22-nbme_roberta_large_pipeline_en

* Add model 2024-09-22-roberta_bert_10_unmalicious_en

* Add model 2024-09-22-burmese_awesome_eli5_mlm_model_ashdev01_pipeline_en

* Add model 2024-09-21-whisper_small_uzbek_gitnazarov_uz

* Add model 2024-09-22-roberta_poetry_religion_crpo_pipeline_en

* Add model 2024-09-22-deeppolicytracker_500k_pipeline_en

* Add model 2024-09-22-burmese_awesome_eli5_mlm_model_jesslimzhiqi_en

* Add model 2024-09-22-burmese_awesome_eli5_mlm_model_jesslimzhiqi_pipeline_en

* Add model 2024-09-19-polarizer_roberta_large_en

* Add model 2024-09-22-deeppolicytracker_500k_en

* Add model 2024-09-22-fine_tune_bert_base_cased_pipeline_en

* Add model 2024-09-22-fine_tune_bert_base_cased_en

* Add model 2024-09-20-symptom_ner_en

* Add model 2024-09-22-bert_base_multilingual_cased_finetuned_squadbn_pipeline_xx

* Add model 2024-09-17-sent_mobilebert_add_pre_training_complete_en

* Add model 2024-09-22-bert_base_multilingual_cased_finetuned_squadbn_xx

* Add model 2024-09-20-snli_roberta_large_seed_1_pipeline_en

* Add model 2024-09-18-roberta_untrained_1eps_seed291_en

* Add model 2024-09-18-xlmr_nepali_english_norwegian_shuffled_orig_test1000_en

* Add model 2024-09-17-burmese_translation_helsinki2_pipeline_en

* Add model 2024-09-21-recipes_roberta_base_norwegian_ingr_en

* Add model 2024-09-14-personal_es

* Add model 2024-09-17-distilbert_base_uncased_squad2_pruned_p35_en

* Add model 2024-09-12-221026optimizedmodel_pipeline_en

* Add model 2024-09-18-sberbank_rubert_base_collection3_pipeline_ru

* Add model 2024-09-21-base_english_combined_v4_2_0_1_16_1e_06_balmy_sweep_40_en

* Add model 2024-09-18-distilbert_base_uncased_distilled_clinc_aaa01101312_pipeline_en

* Add model 2024-09-22-whisper_small_divehi_cordwainersmith_pipeline_en

* Add model 2024-09-22-whisper_small_korean_eyfreq_speed_pipeline_hi

* Add model 2024-09-22-whisper_small_spanish_kevincrb_pipeline_es

* Add model 2024-09-22-spa_portuguese_xlm_r_es

* Add model 2024-09-20-whisper_base_atco2_en

* Add model 2024-09-21-distilbert_base_uncased_finetuned_emotion_crazymoment_pipeline_en

* Add model 2024-09-21-distilbert_amazon_software_reviews_finetuned_pipeline_en

* Add model 2024-09-20-distilbert_sanskrit_saskta_glue_experiment_data_aug_qqp_192_en

* Add model 2024-09-22-whisper_base2_pipeline_ko

* Add model 2024-09-22-sent_bert_tagalog_base_uncased_en

* Add model 2024-09-22-sent_bert_base_multilingual_cased_finetuned_kinyarwanda_xx

* Add model 2024-09-20-roberta_base_biomedical_clinical_spanish_pipeline_en

* Add model 2024-09-20-email_answer_extraction_en

* Add model 2024-09-20-symptom_ner_pipeline_en

* Add model 2024-09-16-bae_roberta_base_mrpc_5_pipeline_en

* Add model 2024-09-19-finbert_revenuefromcontractwithcustomerexcludingassessedtax_pipeline_en

* Add model 2024-09-22-imdb2_pipeline_en

* Add model 2024-09-11-stance_twi_pipeline_en

* Add model 2024-09-16-distilbert_base_uncased_finetuned_squad_maguitai_pipeline_en

* Add model 2024-09-20-ptcrawl_plus_legal_large_v1_7__checkpoint_last_pipeline_en

* Add model 2024-09-16-translator_pipeline_en

* Add model 2024-09-14-hatebertimbau_twitter_pipeline_pt

* Add model 2024-09-19-roberta_base_xshubhamx_en

* Add model 2024-09-21-distilbert_sst2_padding80model_pipeline_en

* Add model 2024-09-21-whisper_small_english_mskov_en

* Add model 2024-09-22-whisper_tiny_minds14_english_us_ghassenhannachi_en

* Add model 2024-09-17-openai_tiny_asr_handson_en

* Add model 2024-09-19-distilbert_base_uncased_finetuned_squad_d5716d28_kiwihead15_en

* Add model 2024-09-10-ner_column_bert_base_ner_pipeline_en

* Add model 2024-09-21-bert_classifier_bert_base_multilingual_uncased_sentiment_pipeline_xx

* Add model 2024-09-16-distilbert_base_uncased_finetuned_squad_sm750s_pipeline_en

* Add model 2024-09-18-burmese_finetuned_distilbert_on_imdb_pipeline_en

* Add model 2024-09-09-en2zh40_en

* Add model 2024-09-20-burmese_awesome_model_imdb_pipeline_en

* Add model 2024-09-16-xlm_roberta_emotion_detector_en

* Add model 2024-09-20-hate_hate_random2_seed0_twitter_roberta_base_dec2020_pipeline_en

* Add model 2024-09-20-distilbert_sanskrit_saskta_glue_experiment_logit_kd_data_aug_qnli_96_pipeline_en

* Add model 2024-09-22-mentalroberta_empai_final2_pipeline_en

* Add model 2024-09-21-clasificadorcorreosoportedistilespanol_dataser_pipeline_en

* Add model 2024-09-17-qa_model_manikanta_goli_pipeline_en

* Add model 2024-09-17-n_distilbert_imdb_padding70model_en

* Add model 2024-09-22-results_javierorjuela_pipeline_en

* Add model 2024-09-22-araroberta_jo_ar

* Add model 2024-09-22-robertacnnrnnfnntransformer2_pipeline_en

* Add model 2024-09-21-distilbert_base_uncased_finetuned_emotion_yerkekz_en

* Add model 2024-09-22-twitchleaguebert_260k_en

* Add model 2024-09-22-xlm_roberta_base_finetuned_panx_german_tirendaz_en

* Add model 2024-09-22-sent_bert_base_cased_model_attribution_challenge_en

* Add model 2024-09-22-sent_deberta_base_uncased_pipeline_en

* Add model 2024-09-18-withinapps_ndd_pagekit_test_content_cwadj_pipeline_en

* Add model 2024-09-21-bertouch_pipeline_en

* Add model 2024-09-20-distilbert_base_uncased_finetuned_cola_majid097_en

* Add model 2024-09-22-sent_bert_base_english_thai_cased_en

* Add model 2024-09-22-sent_hindi_bpe_bert_test_2m_en

* Add model 2024-09-19-patent_ner_test_noisyocr_version_en

* Add model 2024-09-22-sent_bert_base_english_thai_cased_pipeline_en

* Add model 2024-09-21-whisper_small_egy_pipeline_en

* Add model 2024-09-22-whisper_small_indonesian_v1_en

* Add model 2024-09-18-xlm_roberta_base_zero_shot_classifier_xnli_anli_mnli_snli_xx

* Add model 2024-09-17-whisper_base_thai_project_3_pipeline_en

* Add model 2024-09-20-distilbert_base_uncased_small_talk_zphr_0st_ut102ut1_plain_simsp_en

* Add model 2024-09-21-task_1_pipeline_en

* Add model 2024-09-21-distilbert_base_uncased_travel_zphr_0st_ut102ut1_plainprefix_simsp_en

* Add model 2024-09-15-output_en

* Add model 2024-09-21-text_classifier_madanagrawal_pipeline_en

* Add model 2024-09-17-finetuned_bert_model_squad_datset_en

* Add model 2024-09-17-whisper_italian_whispy_it

* Add model 2024-09-21-distilbert_base_uncased_finetuned_emotion_parksuna_en

* Add model 2024-09-20-roberta_base_bc2gm_en

* Add model 2024-09-22-sent_splade_v3_lexical_nirantk_en

* Add model 2024-09-18-burmese_awesome_model_duy221_en

* Add model 2024-09-17-modeltest_pipeline_en

* Add model 2024-09-14-whisper_small_vivos_jrhuy_pipeline_en

* Add model 2024-09-22-sent_indobert_base_p2_finetuned_mer_80k_pipeline_en

* Add model 2024-09-21-sent_bert_base_uncased_copy_en

* Add model 2024-09-22-sent_bert_base_multilingual_cased_finetuned_kinyarwanda_pipeline_xx

* Add model 2024-09-17-bsc_bio_ehr_spanish_symptemist_word2vec_8_ner_pipeline_en

* Add model 2024-09-17-distilbert_base_uncased_qa_model_v1_pipeline_en

* Add model 2024-09-21-bert_multi_turkish_tweet_tr

* Add model 2024-09-18-roberta_poetry_life_crpo_pipeline_en

* Add model 2024-09-18-distilbert_base_uncased_squad2_lora_test_pipeline_en

* Add model 2024-09-11-did_the_doctor_thank_the_patient_for_his_treatment_oriya_wanted_tonga_tonga_islands_belarusian_healthy_bert_last128_pipeline_en

* Add model 2024-09-19-financial_sentiment_model_1500_samples_pipeline_en

* Add model 2024-09-18-db_mc_6a_89_pipeline_en

* Add model 2024-09-18-scenario_non_kd_scr_copy_cdf_english_d2_data_english_cardiff_eng_only_gamma_pipeline_en

* Add model 2024-09-22-whisper_small_persian_farsi_javadr_pipeline_fa

* Add model 2024-09-17-xlm_roberta_base_finetuned_panx_german_athairus_pipeline_en

* Add model 2024-09-19-roberta_base_last_9_chars_acl2023_pipeline_en

* Add model 2024-09-16-qnli_distilled_bart_cross_roberta_pipeline_en

* Add model 2024-09-22-whisper_small_serbian_combined_sr

* Add model 2024-09-19-sexism_in_memes_pipeline_en

* Add model 2024-09-21-snli_6_pipeline_en

* Add model 2024-09-19-bert_finetuned_ner4_ikram11_pipeline_en

* Add model 2024-09-16-consumerresponseclassifier_pipeline_en

* Add model 2024-09-20-xnli_xlm_r_only_english_pipeline_en

* Add model 2024-09-21-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_100_pipeline_en

* Add model 2024-09-11-frabert_distilbert_base_uncased_train_en

* Add model 2024-09-19-finetuning_sentiment_model_6000_samples_pipeline_en

* Add model 2024-09-21-distillbert_8_pipeline_en

* Add model 2024-09-20-burmese_first_test_model_pipeline_en

* Add model 2024-09-17-whisper_base_catalan_ca

* Add model 2024-09-19-finetuned_fakenewsdetect_robertabasedl_en

* Add model 2024-09-22-covid_19_vaccination_tweet_stance_pipeline_en

* Add model 2024-09-20-27th2024dash1_distilbert_base_uncased_2_en

* Add model 2024-09-22-akai_flow_classifier_kmai_dev_test_bot_en

* Add model 2024-09-20-roberta_large_unlabeled_labeled_gab_reddit_task_semeval2023_t10_270000sample_en

* Add model 2024-09-20-distilbert_base_uncased_odm_zphr_0st9sd_ut72ut1large9pfxnf_simsp400_clean100_en

* Add model 2024-09-22-distilbert_base_uncased_linasaba_pipeline_en

* Add model 2024-09-22-marathi_sentiment_movie_reviews_pipeline_mr

* Add model 2024-09-16-opus_maltese_english_french_finetuned_english_tonga_tonga_islands_french_devaibest_en

* Add model 2024-09-22-bert_base_cased_finetuned_emotion_ncduy_pipeline_en

* Add model 2024-09-22-coha1940s_en

* Add model 2024-09-21-burmese_awesome_eli5_mlm_model_skotha_en

* Add model 2024-09-22-autonlp_bp_29016523_en

* Add model 2024-09-18-burmese_awesome_model_duy221_pipeline_en

* Add model 2024-09-22-bert_base_cased_finetuned_runaways_en

* Add model 2024-09-22-burmese_awesome_eli5_mlm_model_ashdev01_en

* Add model 2024-09-22-sent_deberta_base_uncased_en

* Add model 2024-09-21-bert_base_uncased_ep_10_0_b_32_lr_1_2e_06_dp_0_3_swati_0_southern_sotho_false_fh_false_hs_1000_pipeline_en

* Add model 2024-09-22-bodo_roberta_base_sentencepiece_mlm_pipeline_en

* Add model 2024-09-19-xlm_roberta_base_lr0_001_seed42_esp_kinyarwanda_eng_train_en

* Add model 2024-09-20-xlm_roberta_base_tweet_sentiment_french_trimmed_french_30000_pipeline_en

* Add model 2024-09-20-albert_base_qa_squad2_en

* Add model 2024-09-20-bert_base_uncased_finetune_squad_ep_2_25_lr_4e_07_wd_1e_05_dp_0_3_swati_0_southern_sotho_false_fh_false_hs_600_en

* Add model 2024-09-20-fine_tunning_en

* Add model 2024-09-18-0_000005_0_999_en

* Add model 2024-09-21-distilbert_sanskrit_saskta_glue_experiment_data_aug_cola_384_en

* Add model 2024-09-20-distilbert_base_uncased_finetuned_nlp_letters_s1_s2_degendered_en

* Add model 2024-09-20-ner_ner_random0_seed2_roberta_large_pipeline_en

* Add model 2024-09-17-dictabert_large_heq_he

* Add model 2024-09-17-classifier__generated_data_only__meansdetection_albert_en

* Add model 2024-09-21-burmese_awesome_model_catbult_pipeline_en

* Add model 2024-09-22-memo_bert_wsd_danskbert_en

* Add model 2024-09-20-netuid1_classification_pipeline_en

* Add model 2024-09-22-memo_bert_wsd_danskbert_pipeline_en

* Add model 2024-09-22-sent_bert_tagalog_base_uncased_pipeline_en

* Add model 2024-09-22-xlmr_english_chinese_all_shuffled_42_test1000_pipeline_en

* Add model 2024-09-22-xlm_roberta_base_xnli_german_trimmed_german_30000_en

* Add model 2024-09-21-whisper_small_indonesian_therains_pipeline_id

* Add model 2024-09-21-text_classification_10000_en

---------

Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com>
jsl-models and ahmedlone127 authored Sep 22, 2024
1 parent fef850d commit 884d929
Showing 3,091 changed files with 250,193 additions and 0 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
94 changes: 94 additions & 0 deletions docs/_posts/ahmedlone127/2024-09-04-sent_tiny_biobert_en.md
@@ -0,0 +1,94 @@
---
layout: model
title: English sent_tiny_biobert BertSentenceEmbeddings from nlpie
author: John Snow Labs
name: sent_tiny_biobert
date: 2024-09-04
tags: [en, open_source, onnx, sentence_embeddings, bert]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: BertSentenceEmbeddings
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_tiny_biobert` is an English model originally trained by nlpie.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_tiny_biobert_en_5.5.0_3.0_1725454293294.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_tiny_biobert_en_5.5.0_3.0_1725454293294.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
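
The archive name in the links above appears to follow the pattern `<name>_<lang>_<Spark NLP version>_<Spark version>_<timestamp>.zip`, which matches the `name`, `language`, `edition`, and `spark_version` fields in the front matter. The sketch below parses a link under that assumption. The `ModelArchive` type and `parse_model_archive` function are hypothetical helpers, not part of Spark NLP, and reading the trailing number as epoch milliseconds is an inference from the post date rather than something this page documents.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArchive(NamedTuple):
    """Fields recovered from an archive file name (hypothetical helper type)."""
    name: str
    language: str
    spark_nlp_version: str
    spark_version: str
    timestamp: datetime


# Assumed layout: <name>_<lang>_<x.y.z>_<x.y>_<13-digit epoch millis>.zip
_ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_model_archive(url: str) -> ModelArchive:
    """Split the last path segment of an S3 or HTTPS link into its parts."""
    filename = url.rsplit("/", 1)[-1]
    match = _ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized archive name: {filename}")
    return ModelArchive(
        name=match["name"],
        language=match["lang"],
        spark_nlp_version=match["nlp"],
        spark_version=match["spark"],
        # Assumption: the trailing number is milliseconds since the Unix epoch.
        timestamp=datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc),
    )


info = parse_model_archive(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "sent_tiny_biobert_en_5.5.0_3.0_1725454293294.zip"
)
print(info.name, info.language, info.spark_nlp_version, info.spark_version)
print(info.timestamp.date())
```

A link that does not match the assumed pattern raises `ValueError` instead of returning partial fields.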

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, BertSentenceEmbeddings

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

embeddings = BertSentenceEmbeddings.pretrained("sent_tiny_biobert", "en") \
    .setInputCols(["sentence"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables Seq(...).toDF; assumes a SparkSession named `spark`

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

val embeddings = BertSentenceEmbeddings.pretrained("sent_tiny_biobert", "en")
  .setInputCols(Array("sentence"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|sent_tiny_biobert|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentence]|
|Output Labels:|[embeddings]|
|Language:|en|
|Size:|51.9 MB|

## References

https://huggingface.co/nlpie/tiny-biobert
@@ -0,0 +1,94 @@
---
layout: model
title: English xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000 XlmRoBertaForSequenceClassification from vocabtrimmer
author: John Snow Labs
name: xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000
date: 2024-09-04
tags: [en, open_source, onnx, sequence_classification, xlm_roberta]
task: Text Classification
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: XlmRoBertaForSequenceClassification
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000` is an English model originally trained by vocabtrimmer.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000_en_5.5.0_3.0_1725410232175.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000_en_5.5.0_3.0_1725410232175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

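import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, XlmRoBertaForSequenceClassification

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Input columns must match the upstream output columns ("document", not "documents")
sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)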

```
```scala

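import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables Seq(...).toDS; assumes a SparkSession named `spark`

// DocumentAssembler takes a single input and output column
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

// Input columns must match the upstream output columns ("document", not "documents")
val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)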

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|xlm_roberta_base_tweet_sentiment_portuguese_trimmed_portuguese_15000|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[class]|
|Language:|en|
|Size:|358.2 MB|

## References

https://huggingface.co/vocabtrimmer/xlm-roberta-base-tweet-sentiment-pt-trimmed-pt-15000
94 changes: 94 additions & 0 deletions docs/_posts/ahmedlone127/2024-09-05-bert_finetuned_ner_july_en.md
@@ -0,0 +1,94 @@
---
layout: model
title: English bert_finetuned_ner_july DistilBertForTokenClassification from Amhyr
author: John Snow Labs
name: bert_finetuned_ner_july
date: 2024-09-05
tags: [en, open_source, onnx, token_classification, distilbert, ner]
task: Named Entity Recognition
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: DistilBertForTokenClassification
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_july` is an English model originally trained by Amhyr.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_july_en_5.5.0_3.0_1725506530169.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_july_en_5.5.0_3.0_1725506530169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

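import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Input columns must match the upstream output columns ("document", not "documents")
tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_july", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)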

```
```scala

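import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables Seq(...).toDS; assumes a SparkSession named `spark`

// DocumentAssembler takes a single input and output column
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

// Input columns must match the upstream output columns ("document", not "documents")
val tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_july", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)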

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bert_finetuned_ner_july|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[ner]|
|Language:|en|
|Size:|505.4 MB|

## References

https://huggingface.co/Amhyr/bert-finetuned-ner_july
94 changes: 94 additions & 0 deletions docs/_posts/ahmedlone127/2024-09-05-darijabert_ar.md
@@ -0,0 +1,94 @@
---
layout: model
title: Arabic darijabert BertEmbeddings from SI2M-Lab
author: John Snow Labs
name: darijabert
date: 2024-09-05
tags: [ar, open_source, onnx, embeddings, bert]
task: Embeddings
language: ar
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: BertEmbeddings
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `darijabert` is an Arabic model originally trained by SI2M-Lab.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/darijabert_ar_5.5.0_3.0_1725520088704.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/darijabert_ar_5.5.0_3.0_1725520088704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

embeddings = BertEmbeddings.pretrained("darijabert", "ar") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables Seq(...).toDF; assumes a SparkSession named `spark`

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val embeddings = BertEmbeddings.pretrained("darijabert", "ar")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|darijabert|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[bert]|
|Language:|ar|
|Size:|551.5 MB|

## References

https://huggingface.co/SI2M-Lab/DarijaBERT