Models hub internal #13509

Merged
merged 66 commits into from
Feb 13, 2023
fc21e03
Add model 2022-10-19-ner_covid_trials_en
Ahmetemintek Oct 19, 2022
7cd4129
Add model 2022-10-19-ner_jsl_en
Ahmetemintek Oct 19, 2022
1678136
Merge pull request #12953 from JohnSnowLabs/2022-10-19-ner_covid_tria…
vkocaman Oct 20, 2022
808aeca
Add model 2022-10-25-t5_base_pubmedqa_en
HashamUlHaq Oct 25, 2022
5054169
Add model 2022-10-25-ner_oncology_en
mauro-nievoff Oct 25, 2022
29512d1
Add model 2022-10-25-ner_oncology_therapy_en
mauro-nievoff Oct 25, 2022
fa4e6ff
Add model 2022-10-25-ner_oncology_diagnosis_en
mauro-nievoff Oct 25, 2022
4e38f26
Add model 2022-10-25-ner_oncology_tnm_en
mauro-nievoff Oct 25, 2022
c4d5544
Add model 2022-10-25-ner_oncology_anatomy_general_en
mauro-nievoff Oct 25, 2022
fad154d
Add model 2022-10-25-ner_oncology_demographics_en
mauro-nievoff Oct 25, 2022
f19d347
Add model 2022-10-25-ner_oncology_test_en
mauro-nievoff Oct 25, 2022
9e7f69c
Add model 2022-10-25-ner_oncology_unspecific_posology_en
mauro-nievoff Oct 25, 2022
b189864
Add model 2022-10-25-ner_oncology_anatomy_granular_en
mauro-nievoff Oct 25, 2022
bceabd6
Add model 2022-10-25-ner_oncology_response_to_treatment_en
mauro-nievoff Oct 25, 2022
7de9179
Add model 2022-10-25-ner_oncology_biomarker_en
mauro-nievoff Oct 25, 2022
1487652
Add model 2022-10-25-ner_oncology_posology_en
mauro-nievoff Oct 25, 2022
f4e7815
Merge pull request #12984 from JohnSnowLabs/2022-10-25-t5_base_pubmed…
vkocaman Oct 26, 2022
25c01ea
updated bancmark
Cabir40 Oct 26, 2022
4b8785b
Benchmark format updating
mauro-nievoff Oct 26, 2022
5740c9e
Benchmark format updating
mauro-nievoff Oct 26, 2022
dc2aede
Benchmark format updating
mauro-nievoff Oct 26, 2022
751a717
Benchmark format updating
mauro-nievoff Oct 26, 2022
82d68aa
Update 2022-10-25-ner_oncology_anatomy_general_en.md
mauro-nievoff Oct 26, 2022
21057d2
Benchmark format updating
mauro-nievoff Oct 26, 2022
6d54bf8
Benchmark format updating
mauro-nievoff Oct 26, 2022
cae01bb
Benchmark format updating
mauro-nievoff Oct 26, 2022
7df2fd9
Benchmark format update
mauro-nievoff Oct 26, 2022
f675d8d
Benchmark format update
mauro-nievoff Oct 26, 2022
8739091
Benchmark format update
mauro-nievoff Oct 26, 2022
1590752
Benchmark format update
mauro-nievoff Oct 26, 2022
e0efc57
Merge pull request #12989 from JohnSnowLabs/2022-10-25-ner_oncology_e…
vkocaman Oct 27, 2022
27c0cb3
Add model 2022-10-28-sbiobertresolve_icd10pcs_augmented_en
Ahmetemintek Oct 28, 2022
5b0e2f3
Update 2022-10-28-sbiobertresolve_icd10pcs_augmented_en.md
Ahmetemintek Oct 28, 2022
b635676
Merge pull request #13003 from JohnSnowLabs/2022-10-28-sbiobertresolv…
vkocaman Oct 28, 2022
7ad9cda
Add model 2022-10-29-icd10cm_mapper_en
Ahmetemintek Oct 29, 2022
ad9202b
Merge pull request #13004 from JohnSnowLabs/2022-10-29-icd10cm_mapper…
vkocaman Oct 30, 2022
4964ecf
2022-10-30-abbreviation_mapper_augmented_en (#13005)
jsl-models Nov 2, 2022
d45a541
Add model 2022-11-02-icd10cm_resolver_pipeline_en (#13017)
jsl-models Nov 2, 2022
452d8d3
Add model 2022-11-03-oncology_general_pipeline_en (#13031)
jsl-models Nov 3, 2022
9a009b7
2022-11-04-oncology_diagnosis_pipeline_en (#13038)
jsl-models Nov 10, 2022
ddd470a
2022-11-15-ner_sdoh_slim_wip_en (#13088)
jsl-models Nov 15, 2022
6cc8988
Add model 2022-11-16-abbreviation_category_mapper_en (#13095)
jsl-models Nov 16, 2022
8a3d199
2022-11-18-kegg_disease_mapper_en (#13113)
jsl-models Nov 21, 2022
45b58cd
2022-11-22-ner_deid_generic_bert_ro (#13121)
jsl-models Nov 22, 2022
2e76412
2022-11-24-ner_oncology_anatomy_general_en (#13139)
jsl-models Nov 24, 2022
58d2765
Merge branch 'master' into models_hub_internal
Meryem1425 Dec 2, 2022
9884ae8
2022-12-01-oncology_general_pipeline_en (#13178)
jsl-models Dec 6, 2022
20e6f12
Add model 2022-12-15-drug_category_mapper_en (#13230)
jsl-models Dec 15, 2022
f163725
2022-12-17-ner_sdoh_mentions_en (#13245)
jsl-models Dec 17, 2022
ff8b1a0
2022-12-18-meddroprof_scielowiki_es (#13252)
jsl-models Dec 18, 2022
038f9d5
Add model 2022-12-18-ner_sdoh_mentions_test_en (#13259)
jsl-models Dec 18, 2022
d3d8b7c
Delete 2022-12-15-drug_category_mapper_en.md
Cabir40 Dec 19, 2022
3e83552
Delete 2022-12-17-ner_sdoh_mentions_en.md
Cabir40 Dec 19, 2022
a456f0e
Merge branch 'master' into models_hub_internal
Cabir40 Dec 20, 2022
c5701b5
Add model 2023-01-06-redl_clinical_biobert_en (#13313)
jsl-models Jan 6, 2023
ee76bdb
2023-01-11-ner_oncology_unspecific_posology_healthcare_en (#13328)
jsl-models Jan 12, 2023
bdf8d16
2023-01-14-genericclassifier_sdoh_tobacco_usage_sbiobert_cased_mli_en…
jsl-models Jan 14, 2023
28e2d76
2023-01-14-redl_ade_biobert_en (#13349)
jsl-models Jan 15, 2023
9e2dbd2
2023-01-25-ner_eu_clinical_case_en (#13415)
jsl-models Jan 26, 2023
1515298
2023-02-01-ner_eu_clinical_case_es (#13454)
jsl-models Feb 7, 2023
b483fa4
Add model 2023-02-09-rxnorm_drug_brandname_mapper_en (#13493)
jsl-models Feb 10, 2023
b4bdec6
2023-02-10-ner_sdoh_social_environment_wip_en (#13496)
jsl-models Feb 10, 2023
02ccbb8
Update 2022-11-24-ner_oncology_anatomy_general_en.md
Cabir40 Feb 10, 2023
306dacd
fixed conflict
Feb 10, 2023
a6e7371
Update 2023-01-06-redl_clinical_biobert_en.md
Cabir40 Feb 10, 2023
b02e212
2023-02-11-ner_sdoh_wip_en (#13507)
jsl-models Feb 12, 2023
148 changes: 148 additions & 0 deletions docs/_posts/Ahmetemintek/2023-02-09-rxnorm_drug_brandname_mapper_en.md
@@ -0,0 +1,148 @@
---
layout: model
title: Mapping RxNorm and RxNorm Extension Codes with Corresponding Drug Brand Names
author: John Snow Labs
name: rxnorm_drug_brandname_mapper
date: 2023-02-09
tags: [chunk_mapping, rxnorm, drug_brand_name, rxnorm_extension, en, clinical, licensed]
task: Chunk Mapping
language: en
edition: Healthcare NLP 4.3.0
spark_version: 3.0
supported: true
annotator: ChunkMapperModel
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This pretrained model maps RxNorm and RxNorm Extension codes to their corresponding drug brand names. For each code it returns two types of brand names, one per relation: `rxnorm_brandname` and `rxnorm_extension_brandname`.

## Predicted Entities

`rxnorm_brandname`, `rxnorm_extension_brandname`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
[Open in Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Healthcare/26.Chunk_Mapping.ipynb){:.button.button-orange.button-orange-trans.co.button-icon}
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/clinical/models/rxnorm_drug_brandname_mapper_en_4.3.0_3.0_1675966478332.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/clinical/models/rxnorm_drug_brandname_mapper_en_4.3.0_3.0_1675966478332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
# Assumes a licensed Healthcare NLP session is already running and bound to `spark`.
from pyspark.ml import Pipeline
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

# Each input text is a single drug name, so it is emitted directly as a chunk.
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("chunk")

sbert_embedder = BertSentenceEmbeddings\
    .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
    .setInputCols(["chunk"])\
    .setOutputCol("sbert_embeddings")

# Resolve the drug name to an RxNorm code.
rxnorm_resolver = SentenceEntityResolverModel\
    .pretrained("sbiobertresolve_rxnorm_augmented", "en", "clinical/models")\
    .setInputCols(["chunk", "sbert_embeddings"])\
    .setOutputCol("rxnorm_code")\
    .setDistanceFunction("EUCLIDEAN")

# Turn the resolved code into a chunk so the mapper can consume it.
resolver2chunk = Resolution2Chunk()\
    .setInputCols(["rxnorm_code"])\
    .setOutputCol("rxnorm_chunk")

chunkerMapper = ChunkMapperModel.pretrained("rxnorm_drug_brandname_mapper", "en", "clinical/models")\
    .setInputCols(["rxnorm_chunk"])\
    .setOutputCol("mappings")\
    .setRels(["rxnorm_brandname", "rxnorm_extension_brandname"])

pipeline = Pipeline(
    stages=[
        documentAssembler,
        sbert_embedder,
        rxnorm_resolver,
        resolver2chunk,
        chunkerMapper
    ])

# Fit on an empty DataFrame: every stage is pretrained, so no training data is needed.
model = pipeline.fit(spark.createDataFrame([['']]).toDF('text'))

light_pipeline = LightPipeline(model)

result = light_pipeline.fullAnnotate(['metformin', 'advil'])

```
```scala
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("chunk")

val sbert_embedder = BertSentenceEmbeddings
  .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols(Array("chunk"))
  .setOutputCol("sbert_embeddings")

val rxnorm_resolver = SentenceEntityResolverModel
  .pretrained("sbiobertresolve_rxnorm_augmented", "en", "clinical/models")
  .setInputCols(Array("chunk", "sbert_embeddings"))
  .setOutputCol("rxnorm_code")
  .setDistanceFunction("EUCLIDEAN")

val resolver2chunk = new Resolution2Chunk()
  .setInputCols(Array("rxnorm_code"))
  .setOutputCol("rxnorm_chunk")

val chunkerMapper = ChunkMapperModel.pretrained("rxnorm_drug_brandname_mapper", "en", "clinical/models")
  .setInputCols(Array("rxnorm_chunk"))
  .setOutputCol("mappings")
  .setRels(Array("rxnorm_brandname", "rxnorm_extension_brandname"))

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  sbert_embedder,
  rxnorm_resolver,
  resolver2chunk,
  chunkerMapper
))

// One drug name per row.
val data = Seq("metformin", "advil").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)

```
</div>

## Results

```bash
+--------------+-------------+--------------------------------------------------+--------------------------+
| drug_name|rxnorm_result| mapping_result| relation |
+--------------+-------------+--------------------------------------------------+--------------------------+
| metformin| 6809|Actoplus Met (metformin):::Avandamet (metformin...| rxnorm_brandname|
| metformin| 6809|A FORMIN (metformin):::ABERIN MAX (metformin)::...|rxnorm_extension_brandname|
| advil| 153010| Advil (Advil)| rxnorm_brandname|
| advil| 153010| NONE|rxnorm_extension_brandname|
+--------------+-------------+--------------------------------------------------+--------------------------+
```
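
Judging from the sample output above, `mapping_result` appears to pack several brand names into one string separated by `:::`, and `NONE` appears to mark a relation with no match. The following is a minimal sketch of unpacking such a value in plain Python. The delimiter and the sentinel are assumptions inferred from the table, `split_brand_names` is a hypothetical helper that is not part of the library, and the `Brand A`/`Brand B` input is made up for illustration.

```python
def split_brand_names(mapping_result, sep=":::", empty="NONE"):
    """Split a packed mapping string into a list of brand names.

    `sep` and `empty` are assumptions taken from the sample output,
    not documented constants of the model.
    """
    if mapping_result is None or mapping_result.strip() == empty:
        return []
    # Drop surrounding whitespace and skip empty fragments.
    return [part.strip() for part in mapping_result.split(sep) if part.strip()]


# Hypothetical packed value with two entries.
print(split_brand_names("Brand A (drug):::Brand B (drug)"))
# Values taken from the advil rows of the table above.
print(split_brand_names("Advil (Advil)"))
print(split_brand_names("NONE"))
```

Returning an empty list for `NONE` lets downstream code iterate over the result without a special case for unmatched relations.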
{:.model-param}
## Model Information
{:.table-model}
|---|---|
|Model Name:|rxnorm_drug_brandname_mapper|
|Compatibility:|Healthcare NLP 4.3.0+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[rxnorm_chunk]|
|Output Labels:|[mappings]|
|Language:|en|
|Size:|4.0 MB|
2 changes: 1 addition & 1 deletion docs/_posts/Cabir40/2023-01-06-redl_clinical_biobert_en.md
@@ -221,4 +221,4 @@ TrIP 0.517 0.796 0.627 151
TrNAP 0.402 0.672 0.503 112
TrWP 0.257 0.824 0.392 109
Avg. 0.635 0.803 0.691 -
```
```
166 changes: 166 additions & 0 deletions docs/_posts/Meryem1425/2023-02-10-ner_sdoh_demographics_wip_en.md
@@ -0,0 +1,166 @@
---
layout: model
title: Extract Demographic Entities from Social Determinants of Health Texts
author: John Snow Labs
name: ner_sdoh_demographics_wip
date: 2023-02-10
tags: [licensed, clinical, social_determinants, en, ner, demographics, sdoh, public_health]
task: Named Entity Recognition
language: en
edition: Healthcare NLP 4.2.8
spark_version: 3.0
supported: true
annotator: MedicalNerModel
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model extracts demographic information related to Social Determinants of Health from various kinds of biomedical documents.

## Predicted Entities

`Family_Member`, `Age`, `Gender`, `Geographic_Entity`, `Race_Ethnicity`, `Language`, `Spiritual_Beliefs`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/clinical/models/ner_sdoh_demographics_wip_en_4.2.8_3.0_1675998706136.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/clinical/models/ner_sdoh_demographics_wip_en_4.2.8_3.0_1675998706136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}

```python
# Assumes a licensed Healthcare NLP session is already running and bound to `spark`.
from pyspark.ml import Pipeline
from pyspark.sql.types import StringType
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "en")\
    .setInputCols(["document"])\
    .setOutputCol("sentence")

tokenizer = Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")

clinical_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols(["sentence", "token"])\
    .setOutputCol("embeddings")

ner_model = MedicalNerModel.pretrained("ner_sdoh_demographics_wip", "en", "clinical/models")\
    .setInputCols(["sentence", "token", "embeddings"])\
    .setOutputCol("ner")

# Merge token-level NER tags into entity chunks.
ner_converter = NerConverterInternal()\
    .setInputCols(["sentence", "token", "ner"])\
    .setOutputCol("ner_chunk")

pipeline = Pipeline(stages=[
    document_assembler,
    sentence_detector,
    tokenizer,
    clinical_embeddings,
    ner_model,
    ner_converter
])

sample_texts = ["SOCIAL HISTORY: He is a former tailor from Korea.",
                "He lives alone,single and no children.",
                "Pt is a 61 years old married, Caucasian, Catholic woman. Pt speaks English reasonably well."]

# One sample text per row.
data = spark.createDataFrame(sample_texts, StringType()).toDF("text")

result = pipeline.fit(data).transform(data)
```
```scala
val document_assembler = new DocumentAssembler()
.setInputCol("text")
.setOutputCol("document")

val sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "en")
.setInputCols("document")
.setOutputCol("sentence")

val tokenizer = new Tokenizer()
.setInputCols("sentence")
.setOutputCol("token")

val clinical_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
.setInputCols(Array("sentence", "token"))
.setOutputCol("embeddings")

val ner_model = MedicalNerModel.pretrained("ner_sdoh_demographics_wip", "en", "clinical/models")
.setInputCols(Array("sentence", "token", "embeddings"))
.setOutputCol("ner")

val ner_converter = new NerConverterInternal()
.setInputCols(Array("sentence", "token", "ner"))
.setOutputCol("ner_chunk")

val pipeline = new Pipeline().setStages(Array(
document_assembler,
sentence_detector,
tokenizer,
clinical_embeddings,
ner_model,
ner_converter
))

val data = Seq("Pt is a 61 years old married, Caucasian, Catholic woman. Pt speaks English reasonably well.").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)
```
</div>

## Results

```bash
+-----------------+-----+---+------------+
|ner_label |begin|end|chunk |
+-----------------+-----+---+------------+
|Gender |16 |17 |He |
|Geographic_Entity|43 |47 |Korea |
|Gender |0 |1 |He |
|Family_Member |29 |36 |children |
|Age |8 |19 |61 years old|
|Race_Ethnicity |30 |38 |Caucasian |
|Spiritual_Beliefs|41 |48 |Catholic |
|Gender |50 |54 |woman |
|Language |67 |73 |English |
+-----------------+-----+---+------------+
```
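
The `begin` and `end` columns above look like inclusive character offsets measured from the start of each input text, not from the start of the whole batch. A minimal sketch that checks this reading against two of the sample texts, using plain Python slicing (`chunk_at` is a hypothetical helper, and the inclusive-offset interpretation is an assumption drawn from the table):

```python
def chunk_at(text, begin, end):
    """Return the substring covered by an inclusive [begin, end] character span."""
    return text[begin:end + 1]


first = "SOCIAL HISTORY: He is a former tailor from Korea."
third = ("Pt is a 61 years old married, Caucasian, Catholic woman. "
         "Pt speaks English reasonably well.")

# (text, begin, end, expected chunk) rows copied from the results table.
rows = [
    (first, 16, 17, "He"),
    (first, 43, 47, "Korea"),
    (third, 8, 19, "61 years old"),
    (third, 30, 38, "Caucasian"),
    (third, 41, 48, "Catholic"),
    (third, 50, 54, "woman"),
    (third, 67, 73, "English"),
]

for text, begin, end, expected in rows:
    # Fails loudly if the inclusive-offset reading is wrong for any row.
    assert chunk_at(text, begin, end) == expected
```

Because the end offset is inclusive, the Python slice needs `end + 1`; slicing with `text[begin:end]` would drop the last character of every chunk.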

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|ner_sdoh_demographics_wip|
|Compatibility:|Healthcare NLP 4.2.8+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[sentence, token, embeddings]|
|Output Labels:|[ner]|
|Language:|en|
|Size:|858.4 KB|

## Benchmarking

```bash
label tp fp fn total precision recall f1
Age 1346.0 73.0 74.0 1420.0 0.948555 0.947887 0.948221
Spiritual_Beliefs 100.0 13.0 16.0 116.0 0.884956 0.862069 0.873362
Family_Member 4468.0 134.0 43.0 4511.0 0.970882 0.990468 0.980577
Race_Ethnicity 56.0 0.0 13.0 69.0 1.000000 0.811594 0.896000
Gender 9825.0 67.0 247.0 10072.0 0.993227 0.975477 0.984272
Geographic_Entity 225.0 9.0 29.0 254.0 0.961538 0.885827 0.922131
Language 51.0 9.0 5.0 56.0 0.850000 0.910714 0.879310
```
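
The precision, recall, and F1 columns can be reproduced from the `tp`, `fp`, and `fn` counts, and `total` matches `tp + fn` in every row. A minimal sketch of that arithmetic in plain Python, assuming the standard definitions are the ones used here (`prf` is a hypothetical helper name):

```python
def prf(tp, fp, fn):
    """Compute precision, recall, and F1 from raw counts.

    Returns 0.0 for any metric whose denominator would be zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts from the Age row of the benchmark table.
p, r, f1 = prf(1346, 73, 74)
print(f"{p:.6f} {r:.6f} {f1:.6f}")  # prints "0.948555 0.947887 0.948221"
```

The same call with the Race_Ethnicity counts, `prf(56, 0, 13)`, gives a precision of exactly 1.0, which is why that row shows `1.000000` despite a recall of about 0.81.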