Models hub #14458

Merged Nov 13, 2024 (145 commits)
57d855e
Merge branch 'master' into models_hub
maziyarpanahi Nov 21, 2022
41cda2d
Merge branch 'models_hub' of https://github.com/JohnSnowLabs/spark-nl…
maziyarpanahi Nov 25, 2022
6c39602
Merge branch 'master' into models_hub
maziyarpanahi Dec 15, 2022
bed4adb
Merge branch 'master' into models_hub
maziyarpanahi Dec 21, 2022
cf0b08f
Merge branch 'master' into models_hub
maziyarpanahi Feb 7, 2023
93d6753
Merge branch 'master' into models_hub
maziyarpanahi Mar 14, 2023
afb700e
Add model 2023-04-13-CyberbullyingDetection_ClassifierDL_tfhub_en (#1…
jsl-models Apr 13, 2023
bb9a155
2023-04-20-distilbert_base_uncased_mnli_en (#13761)
jsl-models Apr 20, 2023
ea0ba05
2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinl…
jsl-models Apr 21, 2023
9afffb1
2023-05-04-roberta_base_zero_shot_classifier_nli_en (#13781)
jsl-models May 4, 2023
f4356e5
2023-05-09-distilbart_xsum_6_6_en (#13788)
jsl-models May 10, 2023
04149fb
Merge branch 'master' into models_hub
maziyarpanahi May 10, 2023
de3e19e
2023-05-11-distilbart_cnn_12_6_en (#13795)
jsl-models May 11, 2023
71de0f7
2023-05-19-match_pattern_en (#13805)
jsl-models May 21, 2023
f28ea8e
2023-05-22-explain_document_md_fr (#13811)
jsl-models May 23, 2023
4049881
2023-05-24-explain_document_md_fr (#13821)
jsl-models May 25, 2023
e4e465e
Add model 2023-05-25-explain_document_md_fr (#13827)
jsl-models May 25, 2023
e8e01a5
2023-05-25-dependency_parse_en (#13828)
jsl-models May 26, 2023
9c0a24e
Merge branch 'master' into models_hub
maziyarpanahi May 26, 2023
2fd64c3
2023-05-25-distilcamembert_french_legal_fr (#13826)
jsl-models May 26, 2023
795ebf8
Update title for 2023-05-25-distilcamembert_french_legal_fr.md (#13831)
Mary-Sci May 26, 2023
c04ca51
2023-05-27-explain_document_md_fr (#13836)
jsl-models May 27, 2023
4d64d1b
2023-05-28-longformer_base_english_legal_en (#13838)
jsl-models May 28, 2023
02a9afb
2023-05-28-xlm_longformer_base_english_legal_en (#13839)
jsl-models May 29, 2023
d054074
2023-06-21-bert_embeddings_distil_clinical_en (#13861)
jsl-models Jun 21, 2023
43ab794
2023-06-26-distilbert_embeddings_finetuned_sarcasm_classification_en …
jsl-models Jun 26, 2023
7cde44f
2023-06-27-roberta_embeddings_robertinh_gl (#13868)
jsl-models Jun 27, 2023
ced98b6
Add model 2023-06-29-xlmroberta_embeddings_paraphrase_mpnet_base_v2_x…
jsl-models Jun 30, 2023
dfaabd4
2023-06-08-instructor_base_en (#13850)
jsl-models Jul 1, 2023
59113cd
2023-06-28-roberta_base_en (#13871)
jsl-models Jul 1, 2023
740f4fb
Merge branch 'master' into models_hub
maziyarpanahi Jul 3, 2023
c999bd6
Merge branch 'master' into models_hub
maziyarpanahi Jul 4, 2023
27840ed
Add model 2023-07-05-image_classifier_convnext_tiny_224_local_en (#13…
jsl-models Jul 5, 2023
566b6ee
Add model 2023-07-06-quora_distilbert_multilingual_en (#13882)
jsl-models Jul 18, 2023
d246455
removed duplicated sections (#13885)
ahmedlone127 Jul 18, 2023
182bc05
Add model 2023-07-20-xlm_roberta_large_zero_shot_classifier_xnli_anli…
jsl-models Jul 21, 2023
9a1bea5
Add model 2023-07-28-twitter_xlm_roberta_base_sentiment_en (#13905)
jsl-models Jul 28, 2023
cc00383
2023-07-30-albert_embeddings_ALR_BERT_ro (#13910)
jsl-models Aug 2, 2023
b6d3cf1
2023-07-28-twitter_xlm_roberta_base_sentiment_en (#13906)
jsl-models Aug 2, 2023
0504fb7
2023-08-07-bart_large_zero_shot_classifier_mnli_en (#13917)
jsl-models Aug 7, 2023
1a0f376
2023-08-15-gte_base_en (#13922)
jsl-models Aug 15, 2023
0e2bb83
2023-08-15-bge_small_en (#13923)
jsl-models Aug 15, 2023
a11908a
2023-08-18-mpnet_embedding_mpnet_snli_en (#13929)
jsl-models Aug 24, 2023
06f07da
2023-08-22-asr_whisper_tiny_opt_xx (#13931)
jsl-models Aug 24, 2023
b1b99f5
2023-08-25-e5_small_en (#13939)
jsl-models Aug 25, 2023
b891455
2023-08-25-e5_large_v2_opt_en (#13941)
jsl-models Aug 25, 2023
2f27b9a
2023-08-29-mpnet_embedding_tiny_random_mpnet_by_hf_internal_testing_e…
jsl-models Aug 29, 2023
da67ab7
Merge branch 'master' into models_hub
maziyarpanahi Sep 6, 2023
ae1e24f
2023-08-28-asr_whisper_tiny_opt_xx (#13944)
jsl-models Sep 7, 2023
b1da33e
2023-09-07-java_pointer_classifier_en (#13968)
jsl-models Sep 8, 2023
16c83c2
2023-09-09-medium_mlm_imdb_en (#13970)
jsl-models Sep 11, 2023
f3c878e
2023-09-12-tiny_mlm_glue_rte_en (#13975)
jsl-models Sep 13, 2023
6ec2297
2023-09-13-tlm_ag_small_scale_en (#13980)
jsl-models Sep 13, 2023
d6f3fe5
2023-09-13-bert_base_uncased_issues_128_juandeun_en (#13981)
jsl-models Sep 14, 2023
0f86237
2023-09-14-bert_base_cased_finetuned_wallisian_manual_9ep_en (#13982)
jsl-models Sep 15, 2023
d9fdbf0
2023-09-15-distilbert_base_german_cased_de (#13984)
jsl-models Sep 16, 2023
6ee008b
2023-09-18-m3_experiment_albert_base_v2_tweet_eval_hate_word_swapping…
jsl-models Sep 20, 2023
577ecb6
Add model 2023-09-18-AtgxRobertaBaseSquad2_en (#13988)
jsl-models Sep 25, 2023
d4bd550
2023-09-20-image_captioning_vit_gpt2_en (#13999)
jsl-models Sep 25, 2023
00e0a8d
2023-09-21-multilingual_e5_base_xx (#14002)
jsl-models Sep 25, 2023
2fd633a
2023-09-22-bert_embeddings_frpile_gpl_en (#14003)
jsl-models Sep 25, 2023
dfd063d
2023-10-17-asr_whisper_kannada_base_kn (#14030)
jsl-models Oct 18, 2023
a6cb9d4
2023-10-19-asr_whisper_small_urdu_1000_64_1e_05_pretrain_arabic_en (#…
jsl-models Oct 20, 2023
b505d77
2023-10-24-bert_cn_finetuning_18811449050_en (#14039)
jsl-models Oct 25, 2023
824a221
2023-10-25-bert_ft_qqp_79_jeevesh8_en (#14040)
jsl-models Oct 25, 2023
7066a17
2023-10-25-bert_classifier_finbert_esg_en (#14041)
jsl-models Oct 26, 2023
724d1c2
Merge branch 'master' into models_hub
maziyarpanahi Oct 26, 2023
9b279f7
remove model with bad casing
maziyarpanahi Oct 26, 2023
ad57fee
2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_over…
jsl-models Oct 27, 2023
b5f5d92
2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_distillation_en (#14046)
jsl-models Nov 1, 2023
4e58179
Merge branch 'master' into models_hub
maziyarpanahi Nov 4, 2023
471317f
2023-11-01-negbleurt_en (#14050)
jsl-models Nov 4, 2023
8cdd6cc
2023-11-06-bert_ner_alexander_learn_bert_finetuned_ner_accelerate_en …
jsl-models Nov 6, 2023
ab94420
2023-11-06-bert_ner_biobert_ner_bc2gm_corpus_en (#14055)
jsl-models Nov 8, 2023
ca7234a
2023-11-08-scibert_finetuned_ner_fl_en (#14058)
jsl-models Nov 10, 2023
1f5115e
2023-11-12-bert_qa_base_cased_squad2_en (#14065)
jsl-models Nov 13, 2023
5317ba7
2023-11-14-bert_qa_base_parsbert_uncased_finetuned_squad_fa (#14068)
jsl-models Nov 15, 2023
b94cf6f
Merge branch 'master' into models_hub
maziyarpanahi Nov 16, 2023
9b41b70
2023-11-15-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_…
jsl-models Nov 17, 2023
3d8335c
2023-11-17-mbert_quoref_en (#14071)
jsl-models Nov 17, 2023
5c5433d
2023-11-18-distilbert_sequence_classifier_autonlp_tweet_sentiment_ext…
jsl-models Nov 20, 2023
c6f8db4
2023-11-20-distilled_indobert_classification_en (#14074)
jsl-models Nov 22, 2023
7d26933
2023-11-26-distilbert_base_cased_qa_squad2_en (#14077)
jsl-models Nov 28, 2023
8f8fb2c
2023-11-29-roberta_classifier_autonlp_covid_432211280_en (#14080)
jsl-models Dec 7, 2023
058b5f5
2023-12-02-zero_shot_classifier_clip_vit_base_patch32_en (#14082)
jsl-models Dec 8, 2023
6c15fc8
2023-12-13-roberta_ner_roberta_base_biomedical_clinical_spanish_finet…
jsl-models Dec 19, 2023
6e8bb56
2023-12-18-sentiment_analysis_distillbert_base_uncased_model_en (#14102)
jsl-models Dec 24, 2023
cf17306
2023-12-24-roberta_base_wechsel_german_finetuned_germanquad_en (#14108)
jsl-models Dec 27, 2023
2157ace
Add model 2023-12-22-Affiliation_Classifier_Roberta_en (#14106)
jsl-models Dec 27, 2023
9fa1404
2023-12-29-finetuning_sentiment_model_3000_samples_zhaohui_en (#14115)
jsl-models Jan 1, 2024
f38ea17
2024-01-01-bge_small_en (#14116)
jsl-models Jan 1, 2024
2551e9d
2024-01-01-bert_model_12_class_en (#14119)
jsl-models Jan 18, 2024
f242148
Add model 2024-01-10-mpnet_sequence_classifier_ukr_message_en (#14131)
jsl-models Jan 18, 2024
6319810
Merge branch 'master' into models_hub
maziyarpanahi Jan 18, 2024
70ffc23
2024-01-19-deberta_base_zero_shot_classifier_mnli_anli_v3_en (#14144)
jsl-models Jan 22, 2024
4b8ea57
Add model 2024-02-01-bert_zero_shot_classifier_mnli_xx (#14157)
jsl-models Feb 1, 2024
34b675c
Add model 2024-01-20-mpnet_base_question_answering_squad2_en (#14146)
jsl-models Feb 6, 2024
3a56387
2024-02-11-bge_m3_xx (#14170)
jsl-models Feb 11, 2024
15085ce
2024-02-16-distil_asr_whisper_small_en (#14176)
jsl-models Feb 26, 2024
76518b3
2024-04-04-mpnet_embeddings_biolord_2023_c_en (#14226)
jsl-models Apr 5, 2024
0b04051
Add model 2024-04-05-uae_large_v1_en (#14229)
jsl-models Apr 5, 2024
32e77b1
Add model 2024-04-22-mpnet_embeddings_biolord_2023_en (#14240)
jsl-models Apr 22, 2024
ef4b836
Add model 2024-04-22-mpnet_embeddings_biolord_2023_en (#14241)
jsl-models Apr 23, 2024
84f9993
2024-05-06-deepa_xlmroberta_ner_large_en_panx_en (#14246)
jsl-models May 7, 2024
a1aa1fc
2024-06-10-test25_en (#14326)
jsl-models Jun 12, 2024
c2efa8c
2024-06-13-bge_base_english_sec10k_embed_en (#14331)
jsl-models Jun 21, 2024
f0bb625
Merge branch 'master' into models_hub
maziyarpanahi Jul 1, 2024
04d9735
2024-07-01-mpnet_base_token_classifier_en (#14336)
jsl-models Jul 4, 2024
8fe1748
2024-07-05-phi2_7b_en (#14339)
jsl-models Jul 12, 2024
1080c44
2024-07-16-snowflake_artic_m_en (#14352)
jsl-models Aug 12, 2024
b7047c6
Merge branch 'master' into models_hub
maziyarpanahi Aug 15, 2024
962d9e0
2024-08-12-flan_t5_base_tweet_hate_en (#14366)
jsl-models Aug 15, 2024
ffb3b2a
2024-08-16-deberta_base_zero_shot_classifier_mnli_anli_v3_en (#14369)
jsl-models Aug 17, 2024
04ebc29
2024-08-17-long_t5lephone_5000_en (#14371)
jsl-models Aug 20, 2024
32d8591
2024-08-20-legal_t5_small_trans_swedish_czech_small_finetuned_en (#14…
jsl-models Aug 23, 2024
784ce75
2024-08-23-long_t5_local_base_alex2awesome_pipeline_en (#14377)
jsl-models Aug 27, 2024
851cd19
2024-08-27-t5_base_finetuned_cnn_dailymail_en (#14380)
jsl-models Aug 29, 2024
23dc7e3
2024-08-30-camembert_base_qa_fquad_fr (#14386)
jsl-models Sep 1, 2024
67e256a
2024-09-01-trained_polish_en (#14387)
jsl-models Sep 3, 2024
ab101a6
2024-09-03-xlmroberta_ner_base_indonesian_pipeline_id (#14391)
jsl-models Sep 5, 2024
3892863
2024-09-05-sent_arbertv2_ar (#14394)
jsl-models Sep 8, 2024
2df9cfd
2024-09-08-distilbert_base_uncased_finetuned_imdb_chrischang80_en (#1…
jsl-models Sep 9, 2024
6eb7df9
2024-09-09-distilbert_extractive_qa_large_project_pipeline_en (#14397)
jsl-models Sep 10, 2024
69fae36
2024-09-07-fine_tuned_distilbert_pipeline_en (#14398)
jsl-models Sep 11, 2024
df3d300
2024-09-11-opus_maltese_english_german_finetuned_english_tonga_tonga_…
jsl-models Sep 15, 2024
fc72501
2024-09-08-distilbert_base_uncased_finetuned_imdb_chrischang80_pipeli…
jsl-models Sep 16, 2024
bf76942
2024-09-06-xlm_roberta_base_finetuned_panx_german_ahmad_alismail_pipe…
jsl-models Sep 17, 2024
ba5ac12
2024-09-15-roberta_base_epoch_30_pipeline_en (#14402)
jsl-models Sep 18, 2024
2e32685
2024-09-18-roberta_fine_tuned_text_classification_slovene_data_augmen…
jsl-models Sep 19, 2024
fef850d
2024-09-19-ner_chunkyun_pipeline_en (#14404)
jsl-models Sep 20, 2024
884d929
2024-09-18-imdb_0_pipeline_en (#14405)
jsl-models Sep 22, 2024
5c0b554
2024-09-16-finetuning_sentiment_model_3000_samples_sarathaer_en (#14407)
jsl-models Sep 23, 2024
0d3c9a3
Add model 2024-09-23-phi3.5_mini_4k_instruct_q4_gguf_en (#14410)
jsl-models Sep 24, 2024
08b12c6
2024-09-23-bert_finetuned_ner_asos_uncased_pipeline_en (#14409)
jsl-models Sep 24, 2024
8b6cca8
2024-09-23-distilbert_base_uncased_finetuned_cola_garyseventeen_en (#…
jsl-models Sep 25, 2024
69c1a1c
2024-09-25-bert_base_uncased_top_pruned_stsb_pipeline_en (#14415)
jsl-models Sep 26, 2024
efb891c
Merge branch 'master' into models_hub
maziyarpanahi Sep 28, 2024
555e800
2024-09-26-bert_base_uncased_offenseval2019_upsample_en (#14419)
jsl-models Sep 28, 2024
6b5e175
2024-10-21-bge_medembed_small_v0_1_en (#14440)
jsl-models Oct 21, 2024
27f94af
Add model 2024-10-03-blip_vqa_base_en (#14423)
jsl-models Oct 24, 2024
3332695
2024-10-10-gemma_2_2b_it_iq3_m_en (#14432)
jsl-models Oct 30, 2024
d8d4736
2024-10-29-gemma_2_2b_it_iq3_m_en (#14446)
jsl-models Oct 30, 2024
5a556ba
2024-11-01-distilbart_xsum_12_6_en (#14447)
jsl-models Nov 9, 2024
ab69789
2024-11-10-rubert_address_elements_ru (#14452)
jsl-models Nov 11, 2024
30b478f
Add model 2024-11-13-roberta_embeddings_legal_roberta_base_en (#14456)
jsl-models Nov 13, 2024
101 changes: 101 additions & 0 deletions docs/_posts/Cabir40/2024-10-21-bge_medembed_base_v0_1_en.md
@@ -0,0 +1,101 @@
---
layout: model
title: English bge_medembed_base_v0_1 BGEEmbeddings from abhinand
author: John Snow Labs
name: bge_medembed_base_v0_1
date: 2024-10-21
tags: [embedding, en, open_source, bge, medical, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: BGEEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BGEEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.
`bge_medembed_base_v0_1` is an English model originally trained by abhinand.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_medembed_base_v0_1_en_5.5.0_3.0_1729515433167.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_medembed_base_v0_1_en_5.5.0_3.0_1729515433167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

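from pyspark.ml import Pipeline
from sparknlp.annotator import BGEEmbeddings
from sparknlp.base import DocumentAssembler

# Assumes an active Spark NLP session named `spark`, e.g. from sparknlp.start()
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

embeddings = BGEEmbeddings.pretrained("bge_medembed_base_v0_1", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[document_assembler, embeddings])

data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

result = pipeline.fit(data).transform(data)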

```
```scala

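import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.BGEEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // for toDS; assumes a SparkSession named `spark`

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val embeddings = BGEEmbeddings.pretrained("bge_medembed_base_v0_1", "en")
  .setInputCols(Array("document"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))

val data = Seq("I love spark-nlp").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)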

```
</div>

## Results

```bash

+----------------------------------------------------------------------------------------------------+
| bge_embedding|
+----------------------------------------------------------------------------------------------------+
|[{sentence_embeddings, 0, 15, I love spark-nlp, {sentence -> 0}, [-0.018065551, -0.032784615, 0.0...|
+----------------------------------------------------------------------------------------------------+

```
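
Each row in the table above is a single annotation whose fields appear in the order annotator type, begin, end, result, metadata, embeddings. The following is a minimal plain-Python sketch of that layout, meant only as a reading aid: `AnnotationRow` is a hypothetical stand-in (not a Spark NLP class), and the two embedding values are the only ones visible in the truncated output.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnnotationRow:
    """Hypothetical plain-Python mirror of one annotation in the output column."""
    annotator_type: str   # e.g. "sentence_embeddings"
    begin: int            # index of the first character of the text
    end: int              # index of the last character (inclusive)
    result: str           # the text that was embedded
    metadata: Dict[str, str] = field(default_factory=dict)
    embeddings: List[float] = field(default_factory=list)


text = "I love spark-nlp"
row = AnnotationRow(
    annotator_type="sentence_embeddings",
    begin=0,
    end=len(text) - 1,  # inclusive end index of the 16-character sentence
    result=text,
    metadata={"sentence": "0"},
    # Only the leading values shown in the table; the rest is truncated there.
    embeddings=[-0.018065551, -0.032784615],
)

print(row.annotator_type, row.begin, row.end, row.result)
```

The inclusive `end` index is why the table shows `0, 15` for a sentence that is 16 characters long.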

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bge_medembed_base_v0_1|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document]|
|Output Labels:|[bge]|
|Language:|en|
|Size:|389.7 MB|
101 changes: 101 additions & 0 deletions docs/_posts/Cabir40/2024-10-21-bge_medembed_large_v0_1_en.md
@@ -0,0 +1,101 @@
---
layout: model
title: English bge_medembed_large_v0_1 BGEEmbeddings from abhinand
author: John Snow Labs
name: bge_medembed_large_v0_1
date: 2024-10-21
tags: [embedding, en, open_source, bge, medical, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: BGEEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BGEEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.
`bge_medembed_large_v0_1` is an English model originally trained by abhinand.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_medembed_large_v0_1_en_5.5.0_3.0_1729515260623.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_medembed_large_v0_1_en_5.5.0_3.0_1729515260623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

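from pyspark.ml import Pipeline
from sparknlp.annotator import BGEEmbeddings
from sparknlp.base import DocumentAssembler

# Assumes an active Spark NLP session named `spark`, e.g. from sparknlp.start()
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

embeddings = BGEEmbeddings.pretrained("bge_medembed_large_v0_1", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[document_assembler, embeddings])

data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

result = pipeline.fit(data).transform(data)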

```
```scala

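import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.BGEEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // for toDS; assumes a SparkSession named `spark`

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val embeddings = BGEEmbeddings.pretrained("bge_medembed_large_v0_1", "en")
  .setInputCols(Array("document"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))

val data = Seq("I love spark-nlp").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)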

```
</div>

## Results

```bash

+----------------------------------------------------------------------------------------------------+
| bge_embedding|
+----------------------------------------------------------------------------------------------------+
|[{sentence_embeddings, 0, 15, I love spark-nlp, {sentence -> 0}, [-0.018065551, -0.032784615, 0.0...|
+----------------------------------------------------------------------------------------------------+

```
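
Sentence embeddings like the ones above are commonly compared with cosine similarity, for example to rank documents against a query. The sketch below works on plain Python lists, as you might have after collecting the vectors to the driver; the three-dimensional vectors are made up for illustration and are not output of this model, and `cosine_similarity` is a helper defined here, not a Spark NLP function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Made-up vectors, for illustration only.
query = [0.1, 0.3, -0.2]
doc_a = [0.2, 0.6, -0.4]  # same direction as the query
doc_b = [-0.3, 0.1, 0.0]  # orthogonal to the query

print(round(cosine_similarity(query, doc_a), 6))
print(round(cosine_similarity(query, doc_b), 6))
```

For large collections this comparison would normally be done inside Spark or a vector index rather than on collected lists; the helper only illustrates the arithmetic.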

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bge_medembed_large_v0_1|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document]|
|Output Labels:|[bge]|
|Language:|en|
|Size:|1.2 GB|
101 changes: 101 additions & 0 deletions docs/_posts/Cabir40/2024-10-21-bge_medembed_small_v0_1_en.md
@@ -0,0 +1,101 @@
---
layout: model
title: English bge_medembed_small_v0_1 BGEEmbeddings from abhinand
author: John Snow Labs
name: bge_medembed_small_v0_1
date: 2024-10-21
tags: [embedding, en, open_source, bge, medical, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: BGEEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BGEEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.
`bge_medembed_small_v0_1` is an English model originally trained by abhinand.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_medembed_small_v0_1_en_5.5.0_3.0_1729513920928.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_medembed_small_v0_1_en_5.5.0_3.0_1729513920928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

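from pyspark.ml import Pipeline
from sparknlp.annotator import BGEEmbeddings
from sparknlp.base import DocumentAssembler

# Assumes an active Spark NLP session named `spark`, e.g. from sparknlp.start()
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

embeddings = BGEEmbeddings.pretrained("bge_medembed_small_v0_1", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[document_assembler, embeddings])

data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

result = pipeline.fit(data).transform(data)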

```
```scala

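import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.BGEEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // for toDS; assumes a SparkSession named `spark`

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val embeddings = BGEEmbeddings.pretrained("bge_medembed_small_v0_1", "en")
  .setInputCols(Array("document"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))

val data = Seq("I love spark-nlp").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)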

```
</div>

## Results

```bash

+----------------------------------------------------------------------------------------------------+
| bge_embedding|
+----------------------------------------------------------------------------------------------------+
|[{sentence_embeddings, 0, 15, I love spark-nlp, {sentence -> 0}, [-0.07673764, -0.04207312, 0.026...|
+----------------------------------------------------------------------------------------------------+

```
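
The archive name in this card's Download link appears to follow the pattern `<name>_<lang>_<Spark NLP version>_<Spark version>_<timestamp>.zip`. The sketch below splits such a name into its parts. The pattern is inferred from the links in these cards rather than taken from a documented specification, and reading the trailing number as a millisecond Unix timestamp is an assumption (it does agree with the card's `date` field). `ModelArchive` and `parse_archive_name` are hypothetical helper names.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArchive(NamedTuple):
    name: str
    lang: str
    spark_nlp_version: str
    spark_version: str
    uploaded_at: datetime


# Inferred layout: <name>_<lang>_<x.y.z>_<x.y>_<digits>.zip
_ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<ts>\d+)\.zip$"
)


def parse_archive_name(file_name: str) -> ModelArchive:
    """Split a model archive file name into its components."""
    match = _ARCHIVE_RE.match(file_name)
    if match is None:
        raise ValueError(f"unrecognised archive name: {file_name!r}")
    # Assumption: the trailing number is milliseconds since the Unix epoch.
    uploaded = datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc)
    return ModelArchive(
        name=match["name"],
        lang=match["lang"],
        spark_nlp_version=match["nlp"],
        spark_version=match["spark"],
        uploaded_at=uploaded,
    )


info = parse_archive_name("bge_medembed_small_v0_1_en_5.5.0_3.0_1729513920928.zip")
print(info.name, info.lang, info.spark_nlp_version, info.spark_version)
print(info.uploaded_at.date().isoformat())
```

Because model names themselves contain underscores, the sketch anchors on the fixed-shape suffix instead of splitting on `_`.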

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bge_medembed_small_v0_1|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document]|
|Output Labels:|[bge]|
|Language:|en|
|Size:|116.4 MB|
@@ -0,0 +1,70 @@
---
layout: model
title: English dummy_model_jbao8899_pipeline pipeline CamemBertEmbeddings from jbao8899
author: John Snow Labs
name: dummy_model_jbao8899_pipeline
date: 2024-09-09
tags: [en, open_source, pipeline, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
annotator: PipelineModel
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained CamemBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_model_jbao8899_pipeline` is an English model originally trained by jbao8899.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_model_jbao8899_pipeline_en_5.5.0_3.0_1725852224773.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_model_jbao8899_pipeline_en_5.5.0_3.0_1725852224773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

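from sparknlp.pretrained import PretrainedPipeline
# `df` is assumed to be an existing Spark DataFrame holding the pipeline's input text column
pipeline = PretrainedPipeline("dummy_model_jbao8899_pipeline", lang = "en")
annotations = pipeline.transform(df)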

```
```scala

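import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
// `df` is assumed to be an existing Spark DataFrame holding the pipeline's input text column
val pipeline = new PretrainedPipeline("dummy_model_jbao8899_pipeline", lang = "en")
val annotations = pipeline.transform(df)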

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|dummy_model_jbao8899_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|264.0 MB|

## References

https://huggingface.co/jbao8899/dummy-model

## Included Models

- DocumentAssembler
- TokenizerModel
- CamemBertEmbeddings