FAQ fix #13985

Merged
merged 1 commit into from
Sep 15, 2023

12 changes: 9 additions & 3 deletions docs/en/concepts.md
@@ -11,12 +11,16 @@ sidebar:
nav: sparknlp
---

<div class="h3-box" markdown="1">

## Concepts

Spark ML provides a set of Machine Learning applications that can be built using two main components: **Estimators** and **Transformers**. An **Estimator** has a method called `fit()`, which trains a model on a dataset. A **Transformer** is generally the result of a fitting process and applies changes to the target dataset. Spark NLP builds on these same components.

**Pipelines** are a mechanism for combining multiple estimators and transformers in a single workflow. They allow multiple transformations to be chained within a Machine Learning task. For more information, please refer to the [Spark ML](https://spark.apache.org/docs/latest/ml-guide.html) library documentation.
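
As a rough illustration of the Estimator/Transformer/Pipeline pattern described above, here is a minimal pure-Python sketch. It deliberately does not use Spark: `Lowercase`, `VocabularyEstimator`, `VocabularyModel`, `ToyPipeline`, and `ToyPipelineModel` are hypothetical names invented for this example and are not part of Spark ML or Spark NLP. The only point is to show that `fit()` returns a transformer and that a pipeline chains stages.

```python
# Toy illustration only: hypothetical classes, NOT the Spark ML / Spark NLP API.


class Lowercase:
    """A Transformer that needs no fitting: it only rewrites the data."""

    def transform(self, rows):
        return [[token.lower() for token in row] for row in rows]


class VocabularyModel:
    """A Transformer produced by fitting: keeps only tokens seen in training."""

    def __init__(self, vocabulary):
        self.vocabulary = frozenset(vocabulary)

    def transform(self, rows):
        return [[t for t in row if t in self.vocabulary] for row in rows]


class VocabularyEstimator:
    """An Estimator: fit() learns from a dataset and returns a Transformer."""

    def __init__(self, min_count=2):
        self.min_count = min_count

    def fit(self, rows):
        counts = {}
        for row in rows:
            for token in row:
                counts[token] = counts.get(token, 0) + 1
        kept = [t for t, c in counts.items() if c >= self.min_count]
        return VocabularyModel(kept)


class ToyPipelineModel:
    """The fitted pipeline: applies every fitted stage in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class ToyPipeline:
    """Chains stages; each estimator is fitted on the output of earlier stages."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)  # Estimator -> Transformer
            fitted.append(stage)
            rows = stage.transform(rows)
        return ToyPipelineModel(fitted)


train = [["Spark", "NLP"], ["spark", "ml"], ["nlp", "rocks"]]
model = ToyPipeline([Lowercase(), VocabularyEstimator(min_count=2)]).fit(train)
result = model.transform([["Spark", "is", "NLP"]])
print(result)
```

In real Spark ML code the same roles are played by `Pipeline`, `PipelineModel`, and DataFrame-based stages; the sketch above only mirrors the shape of that workflow on plain Python lists.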

</div><div class="h3-box" markdown="1">

## Annotation

The basic result of a Spark NLP operation is an **annotation**. Its structure includes:
@@ -32,7 +36,7 @@ This object is **automatically generated** by annotators after a transform proce

## Annotators

<div class="h3-box" markdown="1">
</div><div class="h3-box" markdown="1">

Annotators are the spearhead of NLP functions in Spark NLP. There are two forms of annotators:

@@ -726,7 +730,7 @@ val recursivePipeline = new RecursivePipeline()
))
```

</div></div>
</div></div><div class="h3-box" markdown="1">

### Params and Features

@@ -737,4 +741,6 @@ we also use Features, which are a way to store parameter maps that are
larger than just a string or a boolean. These features are serialized
as either Parquet or RDD objects, allowing much faster and more scalable
handling of annotator information. Features are also broadcast among executors for
better performance.
better performance.

</div>
7 changes: 5 additions & 2 deletions docs/en/developers.md
@@ -11,14 +11,15 @@ sidebar:
nav: sparknlp
---

<div class="h3-box" markdown="1">

Spark NLP is an open-source library and everyone's contribution is welcome!
In this section we provide a guide on how to set up your environment using IntelliJ IDEA for a smoother start. You can also check the video tutorials available on our YouTube channel: https://www.youtube.com/johnsnowlabs

## Setting up the Environment


<div class="h3-box" markdown="1">
</div><div class="h3-box" markdown="1">

### Import to IntelliJ IDEA

@@ -196,7 +197,7 @@ You can find created jar in the folder ``spark-nlp/python/lib/sparknlp.jar``

*Note: The assembly command creates a fat jar that includes all of its dependencies.*

</div>
</div><div class="h3-box" markdown="1">

### Compiling pypi, whl

@@ -215,3 +216,5 @@ You can find created `whl` and `tar.gz` in the folder ``spark-nlp/python/dist/``
```
pip install spark_nlp-2.x.x-py3-none-any.whl
```

</div>
24 changes: 9 additions & 15 deletions docs/en/display.md
@@ -28,7 +28,7 @@ The ability to quickly visualize the entities/relations/assertion statuses, etc.
The visualisation classes work with the outputs returned by both the `Pipeline.transform()` function and `LightPipeline.fullAnnotate()`.


<br/>
</div><div class="h3-box" markdown="1">

### Install Spark NLP Display

@@ -37,10 +37,13 @@ You can install the Spark NLP Display library via pip by using:
```bash
pip install spark-nlp-display
```

<br/>

A complete guide on how to use the Spark NLP Display library is available <a href="https://github.com/JohnSnowLabs/spark-nlp-display/blob/main/tutorials/Spark_NLP_Display.ipynb">here</a>.

</div><div class="h3-box" markdown="1">

### Visualize a dependency tree

To visualize dependency trees generated with <a href="https://sparknlp.org/docs/en/annotators#dependency-parsers">DependencyParserApproach</a>, you can use the following code.
@@ -64,7 +67,7 @@ The following image gives an example of html output that is obtained for a test
<img class="image image--xl" src="/assets/images/dependency tree viz.png" style="width:70%; align:center; box-shadow: 0 3px 6px rgba(0,0,0,0.16), 0 3px 6px rgba(0,0,0,0.23);"/>


<br/>
</div><div class="h3-box" markdown="1">

### Visualize extracted named entities

@@ -89,8 +92,7 @@ The following image gives an example of html output that is obtained for a coupl
<img class="image image--xl" src="/assets/images/ner viz.png" style="width:80%; align:center; box-shadow: 0 3px 6px rgba(0,0,0,0.16), 0 3px 6px rgba(0,0,0,0.23);"/>



<br/>
</div><div class="h3-box" markdown="1">

### Visualize relations

@@ -112,10 +114,7 @@ The following image gives an example of html output that is obtained for a coupl

<img class="image image--xl" src="/assets/images/relations viz.png" style="width:100%;align:center; box-shadow: 0 3px 6px rgba(0,0,0,0.16), 0 3px 6px rgba(0,0,0,0.23);"/>




<br/>
</div><div class="h3-box" markdown="1">

### Visualize assertion status

@@ -141,10 +140,7 @@ The following image gives an example of html output that is obtained for a coupl

<img class="image image--xl" src="/assets/images/assertion viz.png" style="width:80%;align:center; box-shadow: 0 3px 6px rgba(0,0,0,0.16), 0 3px 6px rgba(0,0,0,0.23);"/>




<br/>
</div><div class="h3-box" markdown="1">

### Visualize entity resolution

@@ -172,6 +168,4 @@ The following image gives an example of html output that is obtained for a coupl

<img class="image image--xl" src="/assets/images/resolution viz.png" style="width:100%;align:center; box-shadow: 0 3px 6px rgba(0,0,0,0.16), 0 3px 6px rgba(0,0,0,0.23);"/>



</div><div class="h3-box" markdown="1">
</div>
10 changes: 10 additions & 0 deletions docs/en/examples.md
@@ -7,6 +7,8 @@ permalink: /docs/en/examples
modify_date: "2022-12-21"
---

<div class="h3-box" markdown="1">

Notebooks and code samples showing how to use Spark NLP in Python and Scala.

## Python Setup
@@ -19,6 +21,8 @@ $ conda activate sparknlp
$ pip install spark-nlp==5.1.1 pyspark==3.3.1
```

</div><div class="h3-box" markdown="1">

## Google Colab Notebook

Google Colab is perhaps the easiest way to get started with spark-nlp. It requires no installation or setup other than having a Google account.
@@ -41,6 +45,8 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi

[Spark NLP quick start on Google Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/quick_start_google_colab.ipynb) is a live demo on Google Colab that performs named entity recognition and sentiment analysis using Spark NLP pretrained pipelines.

</div><div class="h3-box" markdown="1">

## Kaggle Kernel

Run the following code in Kaggle Kernel and start using spark-nlp right away.
@@ -50,7 +56,11 @@ Run the following code in Kaggle Kernel and start using spark-nlp right away.
!wget http://setup.johnsnowlabs.com/kaggle.sh -O - | bash
```

</div><div class="h3-box" markdown="1">

## Notebooks

* [Tutorials and articles](https://medium.com/spark-nlp)
* [Jupyter Notebooks](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples)

</div>
15 changes: 14 additions & 1 deletion docs/en/faq.md
@@ -6,12 +6,13 @@ title: Spark NLP - FAQ
permalink: /docs/en/faq
key: docs-faq
modify_date: "2023-09-14"
use_language_switcher: "Python-Scala"
show_nav: true
sidebar:
nav: sparknlp
---

<div class="h3-box" markdown="1">

### How to use Spark NLP?

To use Spark NLP in Python, follow these steps:
@@ -63,12 +64,16 @@ To use Spark NLP in Python, follow these steps:
6. **Further Reading**:
Dive deeper into the [official documentation](https://sparknlp.org/docs/en/install) for more detailed examples, a complete list of annotators and models, and best practices for building NLP pipelines.

</div><div class="h3-box" markdown="1">

### Is Spark NLP free?

Short answer: 100%! Free forever, including for any commercial use.

Longer answer: Yes, Spark NLP is an open-source library and can be used freely. It's released under the Apache License 2.0. Users can use, modify, and distribute it without incurring costs.

</div><div class="h3-box" markdown="1">

### What is the difference between spaCy and Spark NLP?

Both spaCy and Spark NLP are popular libraries for Natural Language Processing, but Spark NLP shines when it comes to scalability and distributed processing. Here are some key differences between the two:
@@ -85,6 +90,8 @@ Both spaCy and Spark NLP are popular libraries for Natural Language Processing,
- **Spark NLP**: The core library is open-source under the Apache License 2.0, making it free for both academic and commercial use.
- **spaCy**: Open-source and released under the MIT license.

</div><div class="h3-box" markdown="1">

### What are the Spark NLP models?

Spark NLP provides a range of models to tackle various NLP tasks. These models are often pre-trained on large datasets and can be fine-tuned or used directly for inference. Some of the primary categories and examples of Spark NLP models include:
@@ -124,6 +131,8 @@ Spark NLP provides a range of models to tackle various NLP tasks. These models a

For the latest list of models, detailed documentation, and instructions on how to use them, visit the [Official Spark NLP Models Hub](http://sparknlp.org/models).

</div><div class="h3-box" markdown="1">

### What are the main functions of Spark NLP?

Spark NLP offers a comprehensive suite of functionalities tailored for natural language processing tasks via large language models. Some of the main functions and capabilities include:
@@ -183,6 +192,8 @@ Spark NLP is designed to be highly scalable and can handle large-scale text proc

To fully grasp the breadth of functions and learn how to use them, users are encouraged to explore the [official Spark NLP documentation](https://nlp.johnsnowlabs.com/docs/en/quickstart).

</div><div class="h3-box" markdown="1">

### Where can I get prebuilt versions of Spark NLP?

Prebuilt versions of Spark NLP can be obtained through multiple channels, depending on your development environment and platform:
@@ -218,3 +229,5 @@ Prebuilt versions of Spark NLP can be obtained through multiple channels, depend
Apart from the library itself, Spark NLP provides a range of pre-trained models and pipelines. These can be found on the [Spark NLP Model Hub](https://sparknlp.org/models).

Always make sure to consult the [official documentation](https://sparknlp.org/docs/en/quickstart) or the [GitHub repository](https://github.com/JohnSnowLabs/spark-nlp/) for the latest instructions and versions available.

</div>
1 change: 1 addition & 0 deletions docs/en/hardware_acceleration.md
@@ -104,3 +104,4 @@ In future TensorFlow releases, the oneDNN will be enabled by default (starting T


[Webinar: Speed Optimization & Benchmarks in Spark NLP 3: Making the Most of Modern Hardware](https://www.johnsnowlabs.com/watch-webinar-speed-optimization-benchmarks-in-spark-nlp-3-making-the-most-of-modern-hardware/)
</div>