2023-06-27-roberta_embeddings_robertinh_gl #13868

Merged
49 commits
b1fceb9
Add model 2023-06-27-roberta_embeddings_robertinh_gl
ahmedlone127 Jun 27, 2023
57377c6
Add model 2023-06-27-roberta_embeddings_roberta_base_wechsel_german_de
ahmedlone127 Jun 27, 2023
d4a8b9f
Add model 2023-06-27-roberta_embeddings_roberta_base_russian_v0_ru
ahmedlone127 Jun 27, 2023
d84c8f9
Add model 2023-06-27-roberta_embeddings_ruperta_base_finetuned_spa_co…
ahmedlone127 Jun 27, 2023
b0b52e5
Add model 2023-06-27-roberta_embeddings_robasqu_eu
ahmedlone127 Jun 27, 2023
99370d4
Add model 2023-06-27-roberta_embeddings_roberta_ko_small_ko
ahmedlone127 Jun 27, 2023
9863a34
Add model 2023-06-27-roberta_embeddings_hindi_hi
ahmedlone127 Jun 27, 2023
76beef7
Add model 2023-06-27-roberta_embeddings_sundanese_roberta_base_su
ahmedlone127 Jun 27, 2023
4b2f38f
Add model 2023-06-27-roberta_embeddings_roberta_pubmed_en
ahmedlone127 Jun 27, 2023
d848652
Add model 2023-06-27-roberta_embeddings_distilroberta_base_climate_f_en
ahmedlone127 Jun 27, 2023
98b9254
Add model 2023-06-27-roberta_embeddings_roberta_urdu_small_ur
ahmedlone127 Jun 27, 2023
8262ee3
Add model 2023-06-27-roberta_embeddings_BR_BERTo_pt
ahmedlone127 Jun 27, 2023
8287e6e
Add model 2023-06-27-roberta_embeddings_distilroberta_base_climate_d_…
ahmedlone127 Jun 27, 2023
cb3a03d
Add model 2023-06-27-roberta_embeddings_distilroberta_base_climate_d_en
ahmedlone127 Jun 27, 2023
bc1251b
Add model 2023-06-27-roberta_embeddings_ukr_roberta_base_uk
ahmedlone127 Jun 27, 2023
abefd42
Add model 2023-06-27-roberta_embeddings_roberta_base_wechsel_french_fr
ahmedlone127 Jun 27, 2023
d6a6105
Add model 2023-06-27-roberta_embeddings_Bible_roberta_base_en
ahmedlone127 Jun 27, 2023
69949aa
Add model 2023-06-27-roberta_embeddings_bertin_roberta_large_spanish_es
ahmedlone127 Jun 27, 2023
f06879a
Add model 2023-06-27-roberta_embeddings_roberta_base_wechsel_chinese_zh
ahmedlone127 Jun 27, 2023
d00e86a
Add model 2023-06-27-roberta_embeddings_bertin_roberta_base_spanish_es
ahmedlone127 Jun 27, 2023
516a1ec
Add model 2023-06-27-roberta_embeddings_bertin_base_gaussian_es
ahmedlone127 Jun 27, 2023
d08112d
Add model 2023-06-27-roberta_embeddings_bertin_base_random_exp_512seq…
ahmedlone127 Jun 27, 2023
573b8be
Add model 2023-06-27-roberta_embeddings_RuPERTa_base_es
ahmedlone127 Jun 27, 2023
8c8832e
Add model 2023-06-27-roberta_embeddings_roberta_base_bne_es
ahmedlone127 Jun 27, 2023
a849f7e
Add model 2023-06-27-roberta_embeddings_bertin_base_stepwise_exp_512s…
ahmedlone127 Jun 27, 2023
c335692
Add model 2023-06-27-roberta_embeddings_MedRoBERTa.nl_nl
ahmedlone127 Jun 27, 2023
90ced9a
Add model 2023-06-27-roberta_embeddings_bertin_base_random_es
ahmedlone127 Jun 27, 2023
d3b7377
Add model 2023-06-27-roberta_embeddings_RoBERTalex_es
ahmedlone127 Jun 27, 2023
ef84e86
Add model 2023-06-27-roberta_embeddings_SecRoBERTa_en
ahmedlone127 Jun 27, 2023
c3d85ea
Add model 2023-06-27-roberta_embeddings_KanBERTo_kn
ahmedlone127 Jun 27, 2023
c9caac8
Add model 2023-06-27-roberta_embeddings_distilroberta_base_finetuned_…
ahmedlone127 Jun 27, 2023
da853eb
Add model 2023-06-27-roberta_embeddings_MedRoBERTa.nl_nl
ahmedlone127 Jun 27, 2023
38692a9
Add model 2023-06-27-roberta_embeddings_distilroberta_base_finetuned_…
ahmedlone127 Jun 27, 2023
c0fc5e3
Add model 2023-06-27-roberta_embeddings_bertin_base_stepwise_es
ahmedlone127 Jun 27, 2023
a843a54
Add model 2023-06-27-roberta_embeddings_KanBERTo_kn
ahmedlone127 Jun 27, 2023
6d48d87
Add model 2023-06-27-roberta_embeddings_bertin_base_gaussian_exp_512s…
ahmedlone127 Jun 27, 2023
1d8c4ff
Add model 2023-06-27-roberta_embeddings_mlm_spanish_roberta_base_es
ahmedlone127 Jun 27, 2023
3fde011
Add model 2023-06-27-roberta_embeddings_KNUBert_kn
ahmedlone127 Jun 27, 2023
021fb06
Add model 2023-06-27-roberta_embeddings_javanese_roberta_small_jv
ahmedlone127 Jun 27, 2023
a1af26a
Add model 2023-06-27-roberta_embeddings_indonesian_roberta_base_id
ahmedlone127 Jun 27, 2023
ebc912c
Add model 2023-06-27-roberta_embeddings_indic_transformers_hi_roberta_hi
ahmedlone127 Jun 27, 2023
67d99ed
Add model 2023-06-27-roberta_embeddings_indo_roberta_small_id
ahmedlone127 Jun 27, 2023
05dd0dd
Add model 2023-06-27-roberta_embeddings_fairlex_scotus_minilm_en
ahmedlone127 Jun 27, 2023
26d6e42
Add model 2023-06-27-roberta_embeddings_indic_transformers_te_roberta_te
ahmedlone127 Jun 27, 2023
94aa08f
Add model 2023-06-27-roberta_embeddings_javanese_roberta_small_imdb_jv
ahmedlone127 Jun 27, 2023
0d17126
Add model 2023-06-27-roberta_embeddings_jurisbert_es
ahmedlone127 Jun 27, 2023
60f70e9
Add model 2023-06-27-roberta_embeddings_roberta_base_indonesian_522M_id
ahmedlone127 Jun 27, 2023
3c7b53f
Add model 2023-06-27-roberta_embeddings_fairlex_ecthr_minilm_en
ahmedlone127 Jun 27, 2023
a273090
Add model 2023-06-27-roberta_embeddings_muppet_roberta_base_en
ahmedlone127 Jun 27, 2023
149 changes: 149 additions & 0 deletions docs/_posts/ahmedlone127/2023-06-27-roberta_embeddings_BR_BERTo_pt.md
@@ -0,0 +1,149 @@
---
layout: model
title: Portuguese RoBERTa Embeddings (from rdenadai)
author: John Snow Labs
name: roberta_embeddings_BR_BERTo
date: 2023-06-27
tags: [roberta, embeddings, pt, open_source, onnx]
task: Embeddings
language: pt
edition: Spark NLP 5.0.0
spark_version: 3.0
supported: true
engine: onnx
annotator: RoBertaEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained RoBERTa embeddings model, originally published on Hugging Face and adapted for import into Spark NLP. `BR_BERTo` is a Portuguese model originally trained by `rdenadai`.

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_embeddings_BR_BERTo_pt_5.0.0_3.0_1687869764918.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_embeddings_BR_BERTo_pt_5.0.0_3.0_1687869764918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
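
The archive name in the links above appears to repeat the card's front matter: model name, language, Spark NLP edition, Spark version, and an upload timestamp in epoch milliseconds. The sketch below splits a URL into those fields. The layout is an assumption inferred from this card alone, not a documented contract, and `parse_archive_name` and `ARCHIVE_RE` are hypothetical names introduced for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from this card only (not a documented contract):
#   <name>_<lang>_<edition>_<spark_version>_<epoch_millis>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<edition>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_archive_name(url):
    """Split a model archive URL into its (assumed) metadata fields."""
    filename = url.rsplit("/", 1)[-1]
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename}")
    fields = match.groupdict()
    # Convert the epoch-milliseconds suffix into a UTC calendar date.
    millis = int(fields.pop("millis"))
    uploaded = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    fields["uploaded"] = uploaded.date().isoformat()
    return fields


info = parse_archive_name(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "roberta_embeddings_BR_BERTo_pt_5.0.0_3.0_1687869764918.zip"
)
print(info)
```

A check like this can catch a card whose `name`, `language`, `edition`, or `date` fields disagree with the archive it links to.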

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
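import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, RoBertaEmbeddings
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Wrap the raw text column in Spark NLP's document annotation
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained model and compute one vector per token
embeddings = RoBertaEmbeddings.pretrained("roberta_embeddings_BR_BERTo", "pt") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, embeddings])

data = spark.createDataFrame([["Eu amo Spark NLP"]]).toDF("text")

result = pipeline.fit(data).transform(data)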
```
```scala
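import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.RoBertaEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes an active SparkSession named `spark`

// Wrap the raw text column in Spark NLP's document annotation
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split each document into tokens
val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

// Load the pretrained model and compute one vector per token
val embeddings = RoBertaEmbeddings.pretrained("roberta_embeddings_BR_BERTo", "pt")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))

val data = Seq("Eu amo Spark NLP").toDF("text")

val result = pipeline.fit(data).transform(data)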
```


{:.nlu-block}
```python
import nlu
nlu.load("pt.embed.BR_BERTo").predict("""Eu amo Spark NLP""")
```

</div>
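
After the pipeline runs, each row's `embeddings` column holds one annotation per token, with the token text in `result` and its vector in `embeddings`. The sketch below pairs tokens with their vectors using plain Python records so it runs without a Spark session. `token_vectors` is a hypothetical helper, and the sample row with its 3-dimensional vectors is invented for illustration; real vectors from this model are much longer.

```python
def token_vectors(annotations):
    """Pair each token's text with its embedding vector.

    `annotations` is the value of the `embeddings` column for one row:
    a sequence of annotation records exposing `result` (token text) and
    `embeddings` (list of floats) by key. A list of pairs is returned
    rather than a dict because the same token can occur more than once.
    """
    return [(ann["result"], list(ann["embeddings"])) for ann in annotations]


# Hypothetical stand-in for one collected row, e.g. what
# result.select("embeddings").first()["embeddings"] might look like.
# The vector values are made up for this sketch.
row = [
    {"annotatorType": "word_embeddings", "begin": 0, "end": 1,
     "result": "Eu", "embeddings": [0.1, 0.2, 0.3]},
    {"annotatorType": "word_embeddings", "begin": 3, "end": 5,
     "result": "amo", "embeddings": [0.4, 0.5, 0.6]},
    {"annotatorType": "word_embeddings", "begin": 7, "end": 11,
     "result": "Spark", "embeddings": [0.7, 0.8, 0.9]},
    {"annotatorType": "word_embeddings", "begin": 13, "end": 15,
     "result": "NLP", "embeddings": [1.0, 1.1, 1.2]},
]

pairs = token_vectors(row)
for token, vector in pairs:
    print(token, len(vector))
```

Key-based access is used because collected PySpark `Row` objects support it as well as plain dictionaries, so the same helper should work on real output.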

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|roberta_embeddings_BR_BERTo|
|Compatibility:|Spark NLP 5.0.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentence, token]|
|Output Labels:|[bert]|
|Language:|pt|
|Size:|634.5 MB|
|Case sensitive:|true|
@@ -0,0 +1,149 @@
---
layout: model
title: English RoBERTa Embeddings (from abhi1nandy2)
author: John Snow Labs
name: roberta_embeddings_Bible_roberta_base
date: 2023-06-27
tags: [roberta, embeddings, en, open_source, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.0.0
spark_version: 3.0
supported: true
engine: onnx
annotator: RoBertaEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained RoBERTa embeddings model, originally published on Hugging Face and adapted for import into Spark NLP. `Bible-roberta-base` is an English model originally trained by `abhi1nandy2`.

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_embeddings_Bible_roberta_base_en_5.0.0_3.0_1687870518003.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_embeddings_Bible_roberta_base_en_5.0.0_3.0_1687870518003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
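import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, RoBertaEmbeddings
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Wrap the raw text column in Spark NLP's document annotation
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained model and compute one vector per token
embeddings = RoBertaEmbeddings.pretrained("roberta_embeddings_Bible_roberta_base", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, embeddings])

data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text")

result = pipeline.fit(data).transform(data)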
```
```scala
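import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.RoBertaEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes an active SparkSession named `spark`

// Wrap the raw text column in Spark NLP's document annotation
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split each document into tokens
val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

// Load the pretrained model and compute one vector per token
val embeddings = RoBertaEmbeddings.pretrained("roberta_embeddings_Bible_roberta_base", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))

val data = Seq("I love Spark NLP").toDF("text")

val result = pipeline.fit(data).transform(data)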
```


{:.nlu-block}
```python
import nlu
nlu.load("en.embed.Bible_roberta_base").predict("""I love Spark NLP""")
```

</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|roberta_embeddings_Bible_roberta_base|
|Compatibility:|Spark NLP 5.0.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentence, token]|
|Output Labels:|[bert]|
|Language:|en|
|Size:|465.9 MB|
|Case sensitive:|true|